Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DS-3636 Indexes and caching performance updates #1791

Conversation

jonas-atmire
Copy link
Contributor

Pull Request for JIRA ticket:
https://jira.duraspace.org/browse/DS-3636

Put indexes on several widely searched columns + use some additional caching for performance improvements.

Copy link
Contributor

@tomdesair tomdesair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a no-brainer. I only have a few small remarks.

DROP INDEX IF EXISTS resourcepolicy_rptype_idx;
DROP INDEX IF EXISTS resourcepolicy_action_idx;
DROP INDEX IF EXISTS resourcepolicy_resource_type_id_idx;
DROP INDEX IF EXISTS resourcepolicy_resource_id_idx;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a drop if exists for "resourcepolicy_type_id_idx"

@@ -109,6 +108,37 @@ public boolean equals(Object obj)
}

/**
* Create a new resource policy based on this one but copy it to another DSpace object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify in the code comments why this is more efficient than just creating the record first and then setting all the field values (-> requires only insert statement vs 1 insert + 1 update).

CREATE INDEX group2groupcache_parent_id_idx ON group2groupcache (parent_id);
CREATE INDEX group2groupcache_child_id_idx ON group2groupcache (child_id);
CREATE INDEX bundle2bitstream_bitstream_order_idx ON bundle2bitstream (bitstream_order);
CREATE INDEX metadatavalue_mf_place_idx ON metadatavalue (metadata_field_id, place);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already an index on metadata_field_id, so maybe we should just index "place".

@jonas-atmire
Copy link
Contributor Author

Updated the indexes and code comments based on the feedback
Additionally, I noticed that the oracle flyway broke for both this and the previous flyway file
Fixed the following,

  • Added slashes between index checking declarations
  • Updated some of the index names (These are required to be under 30 characters in Oracle)

@alanorth
Copy link
Contributor

alanorth commented Feb 28, 2018

I have only tested the indexes, but this pull request is a no-brainer for me as well. It greatly increases the performance of the XMLUI /submissions page in DSpace 5.5, for example. Before this, I observed hundreds or even thousands of locks on the resourcepolicy and metadatavalue tables during operations on that page. Deleting unfinished submissions would often take several minutes on beefy hardware with modern PostgreSQL 9.5, but complete almost instantly with these indexes.

Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly OK, but I need clarification on why we are modifying an old migration script. See inline comments.

@@ -19,5 +19,5 @@ begin
exception
when index_not_exists then null;
end;

/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change required to this old Migration script? Modifying a previous migration script, even by a single character will change the checksum of that script. That means that Flyway will no longer be able to validate this migration on databases that previously ran it. So, with this one character change, anyone running 5.7 or 5.8 will be required to run a dspace database repair in order to upgrade successfully. See https://wiki.duraspace.org/display/DSDOC6x/Database+Utilities

@tdonohue
Copy link
Member

@tomdesair : just a reminder for you to dismiss your "requested changes" review if you are now satisfied with this PR. It looks to me like the changes you requested have now been made, but I hesitate to dismiss the review of any other Committer.

@tdonohue tdonohue added bug backend: Oracle Related to Oracle database backend backend: Postgres Related to PostgreSQL database backend backend: FlywayDB Related to database migrations or Flyway labels Aug 22, 2018
@J4bbi
Copy link
Contributor

J4bbi commented Nov 23, 2020

@kshepherd, should we consider this for 6.4? I haven't tried it yet but it seems relatively straightforward

if something along similar lines hasn't already been applied, it might be worth porting to 7? @tdonohue

@kshepherd
Copy link
Member

@kshepherd, should we consider this for 6.4? I haven't tried it yet but it seems relatively straightforward

if something along similar lines hasn't already been applied, it might be worth porting to 7? @tdonohue

It looks good - is there a dspace 6 port?

@J4bbi
Copy link
Contributor

J4bbi commented Nov 23, 2020

Yeah, #1792 ...... but like I say, haven't tested this myself....

@tdonohue tdonohue closed this Jan 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: FlywayDB Related to database migrations or Flyway backend: Oracle Related to Oracle database backend backend: Postgres Related to PostgreSQL database backend bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants