This repository has been archived by the owner on Jan 8, 2022. It is now read-only.

Eliminate hydrus dependency on lyberservices-prod #457

Open
6 of 7 tasks
peetucket opened this issue Jul 31, 2020 · 22 comments · Fixed by #461 · May be fixed by #458

Comments

@peetucket
Contributor

peetucket commented Jul 31, 2020

We are trying to turn off lyberservices-prod, but Hydrus still seems to be making requests to it. Look at the logs and try to figure out why, and also whether the dependency is required for Hydrus to continue operating.

Log on lyberservices-prod: /var/log/httpd/access_log

See slack thread: https://stanfordlib.slack.com/archives/C0RK5EM9N/p1596151403493200?thread_ts=1596144800.488800&cid=C0RK5EM9N

See archeology in comments below. Seems to be related to indexing. An attempt was made to add indexing code to Hydrus directly but this caused other problems. See PRs #461 and then #471 which reverted the work.

Next steps to try, with the ultimate goal of turning off lyberservices-prod while keeping Hydrus operational:

  • coordinate a time with Hydrus service team (e.g. Hannah, Amy, Andrew) and Ops to temporarily turn off lyberservices-prod and see what the side effects are
  • same, but instead of turning off lyberservices-prod completely, add a redirect to a new server that can respond to the endpoints hit in the logs with a simple 200, and see what the side effects on Hydrus are
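The second bullet could be sketched as a minimal Rack-style stub that answers the /workflow/... endpoints seen in the access log with a 200. This is only an illustration of the idea; the empty <workflows/> response body is an assumption, not the real workflow service payload.

```ruby
# Minimal stand-in for lyberservices-prod's workflow endpoints: answer any
# GET /workflow/dor/objects/<druid>/workflows with an empty 200 so Hydrus's
# calls succeed without a real workflow service. Hypothetical sketch only.
workflow_stub = lambda do |env|
  if env['PATH_INFO'].match?(%r{\A/workflow/dor/objects/druid:[^/]+/workflows\z})
    # Mimic a druid with no workflows: an empty <workflows/> document (assumed shape).
    [200, { 'Content-Type' => 'application/xml' }, ['<workflows/>']]
  else
    [404, { 'Content-Type' => 'text/plain' }, ['not found']]
  end
end

status, _headers, body =
  workflow_stub.call('PATH_INFO' => '/workflow/dor/objects/druid:yh041gt7606/workflows')
# status => 200, body.first => "<workflows/>"
```

Mounting something like this behind the redirect would let us observe Hydrus's behavior without keeping the full lyberservices stack running.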

Things to test in Hydrus each time

  • create a new object in hydrus
  • version an object in hydrus
  • view homepage
  • view a collection
  • edit a collection
@peetucket
Contributor Author

Looks like a request to get workflows from a hydrus object:

10.111.6.163 - - [31/Jul/2020:13:45:35 -0700] "GET /workflow/dor/objects/druid:yh041gt7606/workflows HTTP/1.1" 200 2060

Hydrus Object: https://argo.stanford.edu/view/druid:yh041gt7606
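To see which objects Hydrus is still requesting workflows for, log lines like the one above can be scanned for druids. A quick sketch (the regex assumes the standard druid shape: 2 letters, 3 digits, 2 letters, 4 digits):

```ruby
# Extract druids from lyberservices-prod access_log lines that hit the
# workflows endpoint. Illustrative only; run against /var/log/httpd/access_log.
WORKFLOW_REQUEST = %r{"GET /workflow/dor/objects/(druid:[a-z]{2}\d{3}[a-z]{2}\d{4})/workflows}

sample = '10.111.6.163 - - [31/Jul/2020:13:45:35 -0700] ' \
         '"GET /workflow/dor/objects/druid:yh041gt7606/workflows HTTP/1.1" 200 2060'

druids = [sample].map { |line| line[WORKFLOW_REQUEST, 1] }.compact.uniq
# druids => ["druid:yh041gt7606"]
```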

@peetucket
Contributor Author

10.111.6.163 is sul-dor-prod

@andrewjbtw

When I looked at this yesterday, it looked like the hydrus druid URLs that get hit are

item druid
collection druid (for that item)
APO druid (for that collection)

@peetucket
Contributor Author

The workflows datastream in that Fedora object has the lyberservices-prod URL in it (since it is an "external" datastream). When you view that datastream in Fedora admin, it triggers the call, which you can see in the logs.

@peetucket
Contributor Author

I bet that URL is embedded in all fedora objects in that workflows datastream (which I believe is now no longer used). Maybe Hydrus shows it somewhere, triggering the request?

@peetucket
Contributor Author

This page is where I thought the request might be coming from, but it is blank for me and doesn't seem to trigger the request in the log: https://sdr.stanford.edu/items/druid:yh041gt7606/datastreams

@andrewjbtw

andrewjbtw commented Jul 31, 2020

Yesterday I made a change to this test collection and saved it: https://sdr.stanford.edu/collections/druid:gs556yg7381 But I didn't change any items in that collection.

That led to the collection druid's workflows URL and its APO druid's workflows URL showing up in the log. But I didn't check until after I made the edit and saved it, so I don't know exactly when the HTTP requests were made.

@peetucket
Contributor Author

When I looked at this yesterday, it looked like the hydrus druid URLs that get hit are

item druid
collection druid (for that item)
APO druid (for that collection)

Can you clarify what you mean by this? Looking at the logs for lyberservices-prod, it appears that requests for the workflows datastream are what show up.

@andrewjbtw

Sorry, I meant that the druids identified in the workflows datastreams in the log corresponded to those items. Not that URLs for those druid item pages were in the log.

@peetucket
Contributor Author

Opened a new version for this test item and saw the requests come in:

https://sdr.stanford.edu/items/druid:ch858tj0413

10.111.6.163 - - [31/Jul/2020:14:54:28 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946
10.111.6.163 - - [31/Jul/2020:14:54:30 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946
10.111.6.163 - - [31/Jul/2020:14:54:58 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946

So related to opening a new version in Hydrus.

@peetucket
Contributor Author

Saving the object makes the requests appear too. So perhaps it is the save that triggers them.

@peetucket
Contributor Author

Loading the hydrus item manually in the Rails console and saving it triggers the requests in the lyberservices-prod logs:

i = Hydrus::Item.find('druid:ch858tj0413')
i.save

leads to this in the logs for lyberservices-prod at /var/log/httpd/access_log

10.111.6.163 - - [31/Jul/2020:14:56:57 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946
10.111.6.163 - - [31/Jul/2020:15:02:27 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946
10.111.6.163 - - [31/Jul/2020:15:02:29 -0700] "GET /workflow/dor/objects/druid:ch858tj0413/workflows HTTP/1.1" 200 7946

@peetucket
Contributor Author

Saving the object also shows some deprecation warnings. These could be unrelated, but perhaps we are running older versions of gems that are triggering the calls to lyberservices. Will check gem versions:

DEPRECATION WARNING: you provided 2 args, but active_lifecycle now takes kwargs. (called from milestones at /opt/app/hydrus/hydrus/shared/bundle/ruby/2.7.0/gems/dor-workflow-client-3.22.0/lib/dor/workflow/client.rb:47)
DEPRECATION WARNING: passing the repo parameter to active_lifecycle is no longer necessary. This will raise an error in dor-workflow-client version 4. (called from milestones at /opt/app/hydrus/hydrus/shared/bundle/ruby/2.7.0/gems/dor-workflow-client-3.22.0/lib/dor/workflow/client.rb:47)
DEPRECATION WARNING: Dor::ReleaseTagService is deprecated and will be removed in dor-services 9.0. (it's moving to dor-services-app). (called from new at /opt/app/hydrus/hydrus/shared/bundle/ruby/2.7.0/gems/dor-services-8.6.0/lib/dor/services/release_tag_service.rb:9)
DEPRECATION WARNING: Dor::ReleaseTags::Purl is deprecated and will be removed in dor-services 9.0. (it's moving to dor-services-app). (called from new at /opt/app/hydrus/hydrus/shared/bundle/ruby/2.7.0/gems/dor-services-8.6.0/lib/dor/services/release_tag_service.rb:15)

@peetucket
Contributor Author

Using Argo as a point of comparison. Argo is currently on dor-services 9.5.0, but Hydrus is locked to ~> 8.0 in its Gemfile and is on 8.6.0.

Argo and hydrus are both on the same version for dor-workflow-client and dor-services-client.

So suspicion falls on dor-services v8 doing something with the workflows datastream on the save.
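The version gap above follows directly from the pessimistic Gemfile pin: '~> 8.0' permits any 8.x release but excludes 9.x. This can be checked with RubyGems' own requirement logic:

```ruby
# Demonstrate why Hydrus resolves dor-services to 8.6.0 while Argo runs 9.5.0:
# the '~> 8.0' constraint means ">= 8.0, < 9".
require 'rubygems'

pin = Gem::Requirement.new('~> 8.0')
pin.satisfied_by?(Gem::Version.new('8.6.0'))  # => true  (what Hydrus resolves to)
pin.satisfied_by?(Gem::Version.new('9.5.0'))  # => false (what Argo runs)
```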

@peetucket
Contributor Author

I suspect we need to update Hydrus to use dor-services v9. Not sure of the implications of this, though; it will benefit from further team discussion. Put up a simple PR here to see what happens when the tests are run with the newer version of dor-services: #458

@aaron-collier aaron-collier self-assigned this Aug 12, 2020
mjgiarlo added a commit that referenced this issue Aug 26, 2020
Because it breaks production and people have work to do.

Re-opens #457 which will be discussed on Tuesday.
@mjgiarlo mjgiarlo reopened this Aug 26, 2020
@mjgiarlo
Member

mjgiarlo commented Aug 26, 2020

We will discuss an approach to this on Tuesday Sept 1st.

@peetucket peetucket changed the title Trace requests going to lyberservices-prod Trace hydrus requests going to lyberservices-prod Sep 1, 2020
@peetucket peetucket changed the title Trace hydrus requests going to lyberservices-prod Eliminate hydrus dependency on lyberservices-prod Sep 1, 2020
@jmartin-sul
Contributor

We'll try this in stage first, possibly editing an existing stage object to use an older lyberservices URL.

@mjgiarlo
Member

I have reached out to @andrewjbtw @hannahfrost @amyehodge and @sul-dlss/operations on Slack to coordinate a time to conduct this testing.

@mjgiarlo
Member

@peetucket @jmartin-sul:

I worked with @amyehodge @tallenaz and @andrewjbtw to decommission lyberservices-test and -prod, and we can confirm that all the tests identified above worked just fine. No weird new behavior, and no exceptions in Honeybadger. We will leave the lyberservices boxes off but keep them around for three weeks, and then re-assess. That way, if they are needed in a pinch, we can turn them back on.

NOTE: we did not do this part:

add a redirect to a new server that can respond to the endpoints hit in the logs with a simple 200, and see what the side effects on Hydrus are

Is that OK?

@mjgiarlo
Member

@tallenaz will check back on this in October and if none of us have lingering concerns, he will decommission the VMs for good. At that point, this issue can be closed.

(This is no longer being actively tracked by the @sul-dlss/infrastructure-team FR.)

@peetucket
Contributor Author

Rock on.

NOTE: we did not do this part:

add a redirect to a new server that can respond to the endpoints hit in the logs with a simple 200, and see what the side effects on Hydrus are

Is that OK?

Yes, this second bullet point was only in case switching it off entirely generated odd behaviors. Since it didn't, I don't see a need to do this.
