This repository has been archived by the owner on Dec 4, 2023. It is now read-only.

WAF link fixes #51

Merged
merged 3 commits into from
Aug 3, 2018

Conversation

benjwadams
Contributor

Fixes some bugs, mostly related to WAF harvesting from ERDDAP 1.82.

Determines the ERDDAP version running by scraping the HTML, then adds
 conditional handling to fetch ERDDAP WAF links depending on the version
 running:
 version < 1.82: Fetches links from <pre> element
 version >= 1.82: Fetches links from <table> inside <div> element
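
The version-dependent handling described above could be sketched roughly as follows. This is not the code merged in this PR: the function and class names are hypothetical, the "ERDDAP, Version X.Y" marker is an assumption about what the scraped page contains, and the sketch looks for any <table> rather than specifically a <table> inside a <div>.

```python
import re
from html.parser import HTMLParser


def erddap_version(html):
    """Return (major, minor) parsed from the page text, or None if absent.

    Assumes the page contains text such as 'ERDDAP, Version 1.82'.
    """
    match = re.search(r"ERDDAP, Version (\d+)\.(\d+)", html)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


class ContainerLinks(HTMLParser):
    """Collect href values of <a> tags nested inside a given container tag."""

    def __init__(self, container):
        super().__init__()
        self.container = container
        self.depth = 0  # how many container tags are currently open
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == self.container:
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:  # skip anchors that carry no href
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == self.container and self.depth > 0:
            self.depth -= 1


def waf_links(html):
    """Pick the container by ERDDAP version, then collect its links."""
    version = erddap_version(html)
    # Falling back to <pre> when no version is found is an assumption.
    use_table = version is not None and version >= (1, 82)
    parser = ContainerLinks("table" if use_table else "pre")
    parser.feed(html)
    return parser.links
```

Comparing the version as a tuple of integers avoids the pitfalls of comparing version strings lexically.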
@ghost ghost assigned benjwadams Aug 2, 2018
@ghost ghost added the review label Aug 2, 2018
@benjwadams
Contributor Author

Related to ioos/catalog#64.
Needs review from someone else (whom?) working on catalog.

@benjwadams benjwadams changed the title Waf link fixes WAF link fixes Aug 2, 2018
@rsignell-usgs
Member

I would hope that someone from @axiom-data-science could take a look here since they apparently have customized the ERDDAP services deployed for CENCOOS and SECOORA, and would have some insight as to why CENCOOS is working and SECOORA is not working (ioos/catalog#64 (comment))

@benjwadams
Contributor Author

benjwadams commented Aug 3, 2018

Parent/referring issue should describe the error condition we're running into. I ran locally with these changes and was able to get things harvested properly.

@mwengren
Member

mwengren commented Aug 3, 2018

@benjwadams If you think it works based on your local testing, let's go ahead and merge and run it on the live Registry. There are only a few ERDDAP WAF sources in there, so if there happens to be an issue it will not be a back breaker.

Removes re module in favor of using str `in` to detect ERDDAP version.

Adds function to prevent failure if text is None when scanning for text
nodes ending in '.xml'

Fixes a couple indentation issues.
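
As a rough illustration of the two changes in this commit message (not the merged code itself), a None-safe suffix check and a substring-based version test might look like the sketch below. Both function names are hypothetical, and the marker string is an assumption about the text found on the scraped page.

```python
def is_xml_name(text):
    """Return True only when text is a string ending in '.xml'.

    A text node can be None, in which case calling .endswith() on it
    would raise AttributeError; here None simply yields False.
    """
    return text is not None and text.endswith(".xml")


def mentions_erddap_182(page_text):
    """Detect the version with a plain substring test instead of a regex.

    The marker string is assumed, not taken from the PR.
    """
    return "ERDDAP, Version 1.82" in page_text
```

A helper like `is_xml_name` can be passed directly as a filter predicate when scanning nodes, so the None check lives in one place.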
@benjwadams benjwadams merged commit 6d7780c into ioos:master Aug 3, 2018
@ghost ghost removed the review label Aug 3, 2018
@kwilcox
Member

kwilcox commented Aug 3, 2018

@rsignell-usgs There is no difference between the CeNCOOS and SECOORA ERDDAP servers.

I harvest all of the WAFs defined in the registry fine (including ERDDAP):

for x in $(curl 'https://registry.ioos.us/api/v1/Harvests' | jq -r '.data[].url'); do
    wget -r -np -A "*.xml" "$x"
done

@rsignell-usgs
Member

@kwilcox, okay, thanks! So @benjwadams, can't we just look at what's happening with CENCOOS and do the same for SECOORA?

@benjwadams
Contributor Author

@rsignell-usgs, the code merged here has been deployed to production, and harvesting is working again. Either the WAF or the ERDDAP-WAF harvest type should now work for ERDDAP 1.82. The previous errors came from fetching <a> elements that had no href and then calling a string method on Python's None, which raised an unhandled exception; both cases are now checked for.


4 participants