Move seaspan datasets to datasets_development folder #95

Closed
wants to merge 1 commit into from

JessyBarrette
Collaborator

@JessyBarrette JessyBarrette commented Jul 28, 2022

Based on some exchanges with @steviewanders, it will be tricky on the server side to start relying on the ERDDAP authorization system, at least in the near future.

Goose ERDDAP

Because of that, we will rely on the Goose development ERDDAP and its Google authorization to serve the protected data (for now). This approach has a few downsides:

  • We are not able to retrieve the data through the ERDDAP API without manually retrieving an authorization. With ERDDAP's authorization system we could potentially avoid that by using custom passwords or the approaches described at https://coastwatch.pfeg.noaa.gov/erddap/download/AccessToPrivateDatasets.html. Since Goose ERDDAP isn't currently using ERDDAP's protection, we can't really use any of the suggestions in the link above.

  • Development ERDDAP (Goose) uses the development branch, while production ERDDAP serves the main branch. This means either keeping some datasets only in the development branch or having datasets continuously fail in the production ERDDAP.

Suggestion: create a new datasets_development folder

We will keep the protected datasets available on Goose only and keep their associated dataset XML files in a datasets_development folder. Goose will concatenate the XML files in both the datasets and datasets_development folders, while production will only use the datasets folder.

That way, we can keep the two branches close to each other and be sure that none of the protected datasets will ever make it to the public production server.

TO DO on servers

  • Change the Goose ERDDAP script that concatenates the different dataset XML files so that it also considers the datasets_development folder.
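
For illustration only, the concatenation step could look something like the sketch below. This is not the actual Goose script: the function name `build_datasets_xml`, the `include_development` flag, and the simplified header and footer are all assumptions, and a real datasets.xml would normally carry additional global settings inside `<erddapDatasets>`.

```python
from pathlib import Path

# Simplified wrapper (assumption): a real datasets.xml usually has more
# global settings between the root tag and the first <dataset>.
HEADER = '<?xml version="1.0" encoding="UTF-8" ?>\n<erddapDatasets>\n'
FOOTER = "</erddapDatasets>\n"


def build_datasets_xml(repo_root, include_development=False):
    """Concatenate per-dataset XML fragments into one datasets.xml string."""
    folders = ["datasets"]
    if include_development:
        # Only the development server (Goose) would pass True here, so
        # protected datasets never reach the production datasets.xml.
        folders.append("datasets_development")

    fragments = []
    for folder in folders:
        folder_path = Path(repo_root) / folder
        if not folder_path.is_dir():
            continue  # a missing folder is skipped rather than treated as an error
        for xml_file in sorted(folder_path.glob("*.xml")):
            fragments.append(xml_file.read_text(encoding="utf-8").strip())

    return HEADER + "\n".join(fragments) + "\n" + FOOTER
```

Under these assumptions, production would call `build_datasets_xml(repo_root)` and Goose would call `build_datasets_xml(repo_root, include_development=True)`, so both servers could track the same branch.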

@n-a-t-e
Member

n-a-t-e commented Jul 28, 2022

> We are not able to retrieve the data through the ERDDAP API without manually retrieving an authorization. With ERDDAP's authorization system we could potentially avoid that by using custom passwords or the approaches described at https://coastwatch.pfeg.noaa.gov/erddap/download/AccessToPrivateDatasets.html. Since Goose ERDDAP isn't currently using ERDDAP's protection, we can't really use any of the suggestions in the link above.

We can also just switch Goose to basic authentication, which is what we use with CIOOS Pacific. This allows you to make API calls easily, using a username and password. We can also add IPs to an allow-list (like we do in CIOOS) so that you don't need to use authentication at all in your script.
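
As a rough sketch of what a scripted API call with a username and password could look like, assuming the server accepts standard HTTP Basic credentials: the host name, dataset ID, and credentials below are hypothetical placeholders, and `build_erddap_request` is a helper made up for this example.

```python
import base64
import urllib.request


def build_erddap_request(url, username, password):
    """Return a urllib Request carrying an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical server, dataset ID, and credentials, for illustration only.
request = build_erddap_request(
    "https://goose.example.org/erddap/tabledap/some_dataset.csv?time,latitude,longitude",
    "my_user",
    "my_password",
)

# To actually fetch the data (needs network access and real credentials):
# with urllib.request.urlopen(request) as response:
#     csv_text = response.read().decode("utf-8")
```

With an IP allow-list in place, the same script would simply skip the `Authorization` header.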

> Development ERDDAP (Goose) uses the development branch, while production ERDDAP serves the main branch. This means either keeping some datasets only in the development branch or having datasets continuously fail in the production ERDDAP.

This is how we've done it so far, both with Hakai and CIOOS Pacific. E.g., there are currently 28 datasets on Goose and 20 on production.

> Change the Goose ERDDAP script that concatenates the different dataset XML files so that it also considers the datasets_development folder.

This is now built in to the docker image, see axiom-data-science/docker-erddap#48

@JessyBarrette
Collaborator Author

JessyBarrette commented Jul 28, 2022

> We can also just switch Goose to basic authentication, which is what we use with CIOOS Pacific. This allows you to make API calls easily, using a username and password. We can also add IPs to an allow-list (like we do in CIOOS) so that you don't need to use authentication at all in your script.

We could. I don't have a preference, though allow-listing specific IP addresses could be annoying at times. I pull data from the cioospacific dev ERDDAP server with great success; this can be helpful when developing a dataset and extracting more information than ERDDAP can provide by itself.

> This is how we've done it so far, both with Hakai and CIOOS Pacific. E.g., there are currently 28 datasets on Goose and 20 on production.

Yes, but the ultimate goal is to have the 25 moved to production (I removed the non-public ones from the count).

> This is now built in to the docker image, see axiom-data-science/docker-erddap#48

As far as I understand, this is reproducing what we already have. What I'm suggesting is having two folders /datasets.d

Development harvests from both directories, while production harvests from only one.

@JessyBarrette
Collaborator Author

This will stay closed until we decide to add protected datasets.

@JessyBarrette JessyBarrette deleted the add-datasets-development branch August 16, 2024 15:40
2 participants