registry-manager won't set archive-status on some collections #97

Closed
Tracked by #30
rchenatjpl opened this issue Sep 8, 2022 · 12 comments
Assignees
Labels
B13.1 bug Something isn't working i&t.done s.high High severity

Comments

@rchenatjpl

'registry-manager set-archive-status' fails to set the archive status for some collections, for no reason I can see.

📜 To Reproduce

Steps to reproduce the behavior:
From pds4@pdscloud-prod1:

% JAVA_HOME="/home/pds4/jvm/jdk-11.0.15.1"
% cd test
% registry-manager delete-data -es "https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443" -all -index registry -auth auth.txt
% /usr/local/build11/harvest-3.6.0/bin/harvest -c contextPds4.xml > ../v4.out
% registry-manager set-archive-status -es "https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443" -status archived -lidvid "urn:nasa:pds:context::1.3" -auth auth.txt
[INFO] Setting product status. LIDVID = urn:nasa:pds:context::1.3, status = archived
[WARN] Collection urn:jaxa:darts:context:agency::1.0 doesn't have primary products.
[WARN] Collection urn:nasa:pds:context:investigation::11.0 doesn't have primary products.

I don't know why those two warnings come up. Maybe registry-manager doesn't like that the bundle keeps more than one collection in a single directory, or maybe it's something else. To pare down the input and help find the bug, I copied the bundle's entire directory and removed two directories that are probably not involved, but then a different collection showed up in the warning:

% diff contextPds4.xml killme.xml
25c25
<     <path>/data/pds4/1700/PDS4_context_bundle_20180723</path>  <!-- 5000+ xml files-->
---
>     <path>/data/pds4/1700/killme/</path>
% diff -r /data/pds4/1700/PDS4_context_bundle_20180723/bundle_context.xml /data/pds4/1700/killme/bundle_context.xml
134,138d133
<         <lid_reference>urn:nasa:pds:context:personnel</lid_reference>
<         <member_status>Primary</member_status>
<         <reference_type>bundle_has_context_collection</reference_type>
<     </Bundle_Member_Entry>
<     <Bundle_Member_Entry>
145,149d139
<         <member_status>Primary</member_status>
<         <reference_type>bundle_has_context_collection</reference_type>
<     </Bundle_Member_Entry>
<     <Bundle_Member_Entry>
<         <lid_reference>urn:nasa:pds:context:telescope</lid_reference>
Only in /data/pds4/1700/PDS4_context_bundle_20180723: personnel-affiliate
Only in /data/pds4/1700/PDS4_context_bundle_20180723: telescope
% registry-manager delete-data -es "https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443" -all -index registry -auth auth.txt
% /usr/local/build11/harvest-3.6.0/bin/harvest -c killme.xml > ../v4k.out 
% registry-manager set-archive-status -es "https://search-en-prod-di7dor7quy7qwv3husi2wt5tde.us-west-2.es.amazonaws.com:443" -status archived -lidvid "urn:nasa:pds:context::1.3" -auth auth.txt
[INFO] Setting product status. LIDVID = urn:nasa:pds:context::1.3, status = archived
[WARN] Collection urn:nasa:pds:context:facility::5.0 doesn't have primary products.

Long story short: why does registry-manager skip two collections in the first run and an entirely different collection in the second?
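One way to make the difference between the two runs explicit is to pull the collection LIDVIDs out of the `[WARN]` lines and compare them as sets. The following is a minimal Python sketch, not part of registry-manager. The function names `warned_collections` and `compare_runs` are hypothetical, and the regular expression assumes the warning wording shown in the output above.

```python
import re

# Assumption: the warning text matches the wording in the transcripts above.
WARN_PATTERN = re.compile(
    r"\[WARN\] Collection (\S+) doesn['’]t have primary products\."
)

# Output copied from the two set-archive-status runs in this report.
FIRST_RUN = """\
[INFO] Setting product status. LIDVID = urn:nasa:pds:context::1.3, status = archived
[WARN] Collection urn:jaxa:darts:context:agency::1.0 doesn't have primary products.
[WARN] Collection urn:nasa:pds:context:investigation::11.0 doesn't have primary products.
"""

SECOND_RUN = """\
[INFO] Setting product status. LIDVID = urn:nasa:pds:context::1.3, status = archived
[WARN] Collection urn:nasa:pds:context:facility::5.0 doesn't have primary products.
"""


def warned_collections(log_text):
    """Return the set of collection LIDVIDs named in [WARN] lines."""
    return set(WARN_PATTERN.findall(log_text))


def compare_runs(first_log, second_log):
    """Group warned collections by which run(s) reported them."""
    first = warned_collections(first_log)
    second = warned_collections(second_log)
    return {
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        "both": sorted(first & second),
    }


if __name__ == "__main__":
    for group, lidvids in compare_runs(FIRST_RUN, SECOND_RUN).items():
        print(group, lidvids)
```

With the two logs above, no collection is flagged in both runs, even though none of the three flagged collections was removed from the pared-down copy. That suggests the warning may not be tied to a specific collection.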

🕵️ Expected behavior

registry-manager should have been able to set every product's archive-status.

📚 Version of Software Used

Harvest version: 3.6.0
Build time: 2022-04-13T17:42:54Z
Registry Manager version: 4.4.0
Build time: 2022-04-13T18:24:45Z

🩺 Test Data / Additional context

Please be careful. The first test above accesses the operational bundle of context products. And I put the killme/ directory in an operational area.


🦄 Related requirements

⚙️ Engineering Details

@rchenatjpl rchenatjpl added bug Something isn't working needs:triage labels Sep 8, 2022
@jordanpadams jordanpadams added B13.0 s.high High severity and removed needs:triage labels Sep 9, 2022
@jordanpadams
Member

@alexdunnjpl once we get the DOI Service updates wrapped up, we are going to move over the registry loader tools 🎉

@tloubrieu-jpl
Member

@alexdunnjpl could you look at this ticket when you have a chance? After the version bug would be a good time; this bug is higher priority.

@tloubrieu-jpl
Member

Hopefully it is fixed by #118.

@alexdunnjpl
Contributor

@rchenatjpl is it possible for you to repeat these tests with a build from the latest snapshot version of registry-common? No problem if not, it just saves some risk re "be careful".

@rchenatjpl
Author

@alexdunnjpl Sure, but I'll need a little time, both because I'm busy and because I have no memory of this issue.

@alexdunnjpl
Contributor

@rchenatjpl No worries - alternatively (if it's faster), if you could run the commands again and provide a full output, and flick me the bundle in question, I can start taking a look myself - I have access to pdscloud-prod1, but apparently not to the pds4 user (given that it's asking for a password).

If it's just warning messages and not an actual failure to process, there may not be anything to fix - it's not totally clear from the original post.

@rchenatjpl
Author

Hi, @alexdunnjpl, are you ok if I punt on this? I don't know how to build from the source code, and I'm not sure what I'd be building - registry-manager, not harvest, right? And looking back at my test procedure, I think there's a decent chance I'd be wiping out other people's data.

@alexdunnjpl
Contributor

@rchenatjpl for sure - in that case could you LFT me the affected bundle so I can examine/test locally? I don't have access to that host.
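For examining a local copy of the bundle, one check suggested by the original post is whether any directory holds more than one collection label. The following is a rough Python sketch of that check, not something registry-manager or harvest provides. The function names are hypothetical, and it assumes that collection labels are `.xml` files whose root element is `Product_Collection`.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}Product_Collection'."""
    return tag.rsplit("}", 1)[-1]


def collections_by_directory(bundle_root):
    """Map each directory under bundle_root to the collection labels it holds."""
    found = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(bundle_root):
        for name in sorted(filenames):
            if not name.lower().endswith(".xml"):
                continue
            path = os.path.join(dirpath, name)
            try:
                root = ET.parse(path).getroot()
            except ET.ParseError:
                continue  # skip files that are not well-formed XML
            # Assumption: a collection label has Product_Collection as its root.
            if local_name(root.tag) == "Product_Collection":
                found[dirpath].append(name)
    return dict(found)


def shared_directories(bundle_root):
    """Return only the directories that contain more than one collection label."""
    return {
        directory: names
        for directory, names in collections_by_directory(bundle_root).items()
        if len(names) > 1
    }
```

Calling `shared_directories("/path/to/local/bundle/copy")` (a placeholder path) would list the directories worth a closer look. An empty result would argue against the shared-directory hypothesis.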

@rchenatjpl
Author

Holy cow, it's hard to reconstruct what I did. I think I left the test config file in pdscloud-prod1:/home/pds4/test/, but it's not there now. It looks like I was trying to ingest the bundle with all context products, which from pdscloud-prod1, -prod2, or -gamma is at /data/pds4/context-pds4/. There was a soft link problem with that path, though, so I used /data/pds4/1700/PDS4_context_bundle_20180723/ instead.
For that, hopefully the attached config file works.
test1.xml.zip

@alexdunnjpl
Contributor

alexdunnjpl commented Dec 20, 2022

With the content from https://pds.nasa.gov/data/pds4/1700/PDS4_context_bundle_20180723/ ingested using the provided config (with updated paths) and current/snapshot versions of registry and registry-common, the given command completes successfully with no warnings.

@jordanpadams @rchenatjpl suggest closing as already fixed.

@jordanpadams
Member

Closed per other bug fixes since the 3.6.0 release.

@jordanpadams
Member

thanks for checking on this @alexdunnjpl

6 participants