Update release date handling for AIRR template #32

Closed
martinjoconnor opened this issue Jul 19, 2018 · 5 comments

martinjoconnor commented Jul 19, 2018

The problem is described in the following email to Yuriy at NCBI (he confirmed that the suggested solution is appropriate):

Our understanding is that the public release of, say, an SRA submission entry will force the release of the referenced BioSample submission entries, which will in turn force the public release of the referenced BioProject. So, even if, for example, a BioProject submission has a future release date, the public release of a BioSample that references it will effectively make the BioProject public irrespective of its release date.

Is this understanding correct?

The reason this is an issue for us is that we are generating a BioProject/BioSample/SRA submission from the AIRR specification, which includes only a BioSample release date and has no overall submission release date or SRA entry release date. Since the SRA parts of the submission have no release date, we are assuming that they are released immediately, which forces release of the referenced BioSamples and in turn the BioProject, effectively making the entire submission public immediately.

Are we right to assume that we should include release dates for each BioProject, BioSample, and SRA entry to fully control the public release dates?
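
The cascade described in the email can be sketched as follows. This is only an illustration of the stated understanding (an entry becomes public on its own release date or when anything that references it becomes public, whichever is first). The function name, the dictionary layout, and the treatment of a missing date as "released today" are assumptions made for the example, not NCBI's actual processing.

```python
from datetime import date


def effective_release_dates(requested, references, today):
    """Compute when each entry effectively becomes public.

    requested:  dict of entry name -> requested release date, or None if unset
    references: dict of entry name -> list of entry names it references
    today:      date used for entries without a release date (assumption:
                a missing date means immediate release)
    """
    effective = {
        name: (d if d is not None else today) for name, d in requested.items()
    }
    # Propagate until stable: releasing an entry releases everything it references.
    changed = True
    while changed:
        changed = False
        for source, targets in references.items():
            for target in targets:
                if effective[source] < effective[target]:
                    effective[target] = effective[source]
                    changed = True
    return effective


# Hypothetical submission: only the BioProject and BioSample carry a future date.
requested = {
    "BioProject": date(2020, 7, 7),
    "BioSample": date(2020, 7, 7),
    "SRA": None,
}
references = {"SRA": ["BioSample"], "BioSample": ["BioProject"]}
result = effective_release_dates(requested, references, today=date(2018, 7, 19))
```

Under these assumptions every entry ends up with the SRA entry's effective date, which is why a release date is needed on each entry, or one date applied to all of them.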
@martinjoconnor martinjoconnor added this to the 2.0 milestone Jul 19, 2018
@martinjoconnor martinjoconnor changed the title update release date handling for AIRR Update release date handling for AIRR Jul 19, 2018
@martinjoconnor martinjoconnor changed the title Update release date handling for AIRR Update release date handling for AIRR template Jul 19, 2018

@graybeal

Based on the initial reaction, I suspect that using a single release date for everything makes the most sense.

@marcosmro

I have updated the submission server to work with the latest MiAIRR template and to use the top-level release date for everything.

I have started a test submission with a sample instance to check that everything works. However, I found a minor issue in the template that should be fixed before users start populating it: the top-level release date is a text field. I think it should be a date field, to prevent malformed dates that would break the submission. If you agree, I will update it.
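
As long as the release date remains a text field, a strict check before submission would catch malformed values. The sketch below is a hypothetical helper, not part of the submission server, and it assumes the expected format is `YYYY-MM-DD`.

```python
import re
from datetime import date

# Assumption: release dates are expected as zero-padded YYYY-MM-DD.
_DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_release_date(text):
    """Return a date for a strict YYYY-MM-DD string, or raise ValueError."""
    match = _DATE_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"release date must be YYYY-MM-DD, got {text!r}")
    year, month, day = (int(part) for part in match.groups())
    # date() itself raises ValueError for impossible dates such as 2020-02-30.
    return date(year, month, day)
```

Calling `parse_release_date("2020-07-07")` returns a `date`, while values such as `"07/07/2020"` or `"2020-02-30"` raise `ValueError`.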

Apart from that, I found some model issues, probably because the elements were created some time ago following an old version of the model:

  • The field “Number of Cells per Sequencing Reaction” contains an @type (instance type). We need to remove it, because it makes the editor treat the value of the field as a URI.
  • “Cell Subset”.“@id” should not be required
  • “Diagnosis1”.“@id” should not be required
  • The following properties should not be required by the BioSample element: schema:isBasedOn, schema:name, schema:description, pav:createdOn, pav:createdBy, pav:lastUpdatedOn, oslc:modifiedBy. (Schema path: #/properties/BioSample for AIRR NCBI/items/required).
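
The last item can be illustrated on an in-memory copy of the template's JSON Schema. This is a minimal sketch that assumes the template is loaded as a Python dict with the schema path quoted above; `relax_required` is a hypothetical helper, and the fix itself may be applied differently (for example, directly in MongoDB).

```python
import copy

# Properties that should no longer be required by the BioSample element.
NOT_REQUIRED = {
    "schema:isBasedOn",
    "schema:name",
    "schema:description",
    "pav:createdOn",
    "pav:createdBy",
    "pav:lastUpdatedOn",
    "oslc:modifiedBy",
}


def relax_required(template, element_key="BioSample for AIRR NCBI"):
    """Return a copy of the template with NOT_REQUIRED dropped from
    #/properties/<element_key>/items/required."""
    updated = copy.deepcopy(template)
    items = updated["properties"][element_key]["items"]
    items["required"] = [
        name for name in items.get("required", []) if name not in NOT_REQUIRED
    ]
    return updated
```

The original dict is left untouched, so the result can be compared with it before anything is written back.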

I can fix these issues too and create a new copy of the MiAIRR template. I will probably fix them first in the elements at the MongoDB level and then I will regenerate the template using the Template Editor. If there are any existing instances for the template, they will have to be updated.

graybeal commented Aug 2, 2018

Summarizing the email I sent you:

  • The current date type doesn't have a widget, doesn't do validation, and gets the date wrong by a day. So while it may be a good idea to change to a date field anyway (so we don't have to change the template again later), be aware that it won't fix anything right away.
  • Please update the last modified dates of the elements even if done in MongoDB.
  • Likely the only two significant instances are for Hailong and Anne, locations provided separately. Should check for any new instances. (Anything in my directory path does not have to be updated, it is for testing/development only.)

I just looked at a few of the linked BioSamples and they are publicly accessible. Is that what we expect? The example submission looks very un-demo-like, actually. The beginning reads:

This is an automatic acknowledgment that your recent submission to the BioSample database has been successfully processed and will be released on the date specified.

BioSample accessions:			SAMN02181721, SAMN02181722, SAMN02181723, SAMN02181724, SAMN02181725, SAMN02181726, SAMN02181727, SAMN02181728, SAMN02181729, SAMN02181730, ... see attached file.
Temporary SubmissionID:	SUB424311
Release date:			2020-07-07-07:00, or with the release of linked data, whichever is first

An example link is
https://www.ncbi.nlm.nih.gov/biosample/2181732
(and down through ...721). Which points to this project:
https://www.ncbi.nlm.nih.gov/bioproject/205706

Is that what you submitted? Should I be able to see it?

@marcosmro

@graybeal Thanks for your comments. The submission was targeted to NCBI's TEST folder, so it shouldn't be publicly accessible.

Regarding the message that you received, the submission ID and the release date correspond to my submission, but the BioSample accessions don't. The information from the links is not familiar to me either. That's not our BioProject. That BioSample was submitted in 2013.
