baseline filter #328
Comments
We have a field |
Awesome! Thanks!
{"errors":[],"info":{"apiVersion":1,"dataVersion":1690103788,"deprecationDate":null,"deprecationInfo":null,"acknowledgement":null},"data":[{"samplingStrategy":"A","count":48019},{"samplingStrategy":"X","count":192119},{"samplingStrategy":"Y","count":44101},{"samplingStrategy":"N","count":314101},{"samplingStrategy":null,"count":7683436}]}
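For anyone who wants to see how much of the data carries this annotation, here is a minimal Python sketch that tallies the response posted above. The JSON is copied verbatim from that response; the helper name `tally_sampling_strategy` is made up for illustration and is not part of LAPIS.

```python
import json

# Aggregated response as posted above: one row per samplingStrategy code.
response_text = """
{"errors":[],"info":{"apiVersion":1,"dataVersion":1690103788,
 "deprecationDate":null,"deprecationInfo":null,"acknowledgement":null},
 "data":[{"samplingStrategy":"A","count":48019},
         {"samplingStrategy":"X","count":192119},
         {"samplingStrategy":"Y","count":44101},
         {"samplingStrategy":"N","count":314101},
         {"samplingStrategy":null,"count":7683436}]}
"""


def tally_sampling_strategy(payload: dict) -> dict:
    """Map each samplingStrategy code (None when unannotated) to its count."""
    return {row["samplingStrategy"]: row["count"] for row in payload["data"]}


counts = tally_sampling_strategy(json.loads(response_text))
total = sum(counts.values())
annotated = total - counts.get(None, 0)  # sequences with any code set
print(f"annotated: {annotated} of {total} ({annotated / total:.1%})")
```

In this snapshot only about 7% of sequences have a code at all, so any filter built on the field would drop most of the data.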
The fields A, X, Y, N are shown only for data pulled from RKI (Germany's CDC), as opposed to GenBank. Their README is here: https://github.com/robert-koch-institut/SARS-CoV-2-Sequenzdaten_aus_Deutschland It's a bit scrambled; the sentences seem incomplete. I would say: I'm not sure how reliable the annotation is, though. I remember that when I looked into it a year ago, it seemed like sequences labelled as representative sampling weren't necessarily representative. I think the field was introduced back in the day when labs started to do variant PCRs to get a quick idea of which variant a patient had, since variant PCR was as fast as PCR and involved less delay than waiting for whole genome sequencing.
Ah, thanks very much to you both. Since @chaoran-chen's example uses the open API, I was also wondering about the binding from the "purpose_of_sampling" tag in NCBI to the codes explained by @corneliusroemer's link?
Ah, very nice @aswarren! The open data gets to LAPIS via nextstrain/ncov-ingest, and I don't think we currently use that
Pulling surveillance data from the API returns all sequences, regardless of the reason they were sampled. In the case of the US / GISAID, this includes traveller surveillance, which can give a very different picture than domestic spread when estimating prevalence for a particular area. Is there a way to filter sequences based on a baseline sequencing tag? If not, it would be useful to have.
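Until such a filter exists server-side, the same effect could be approximated client-side. The sketch below is only an illustration: the record layout, the `id` values, and the helper `filter_by_sampling_strategy` are hypothetical, and treating `"Y"` as the baseline code is an assumption that should be checked against the data provider's documentation.

```python
def filter_by_sampling_strategy(records, wanted_codes):
    """Keep only records whose samplingStrategy is one of wanted_codes."""
    wanted = set(wanted_codes)
    return [r for r in records if r.get("samplingStrategy") in wanted]


# Hypothetical per-sequence records; real downloads may use other field names.
records = [
    {"id": "seq-1", "samplingStrategy": "Y"},
    {"id": "seq-2", "samplingStrategy": "A"},
    {"id": "seq-3", "samplingStrategy": None},  # no annotation at all
]

# Assumption for illustration: "Y" marks baseline (representative) sampling.
baseline = filter_by_sampling_strategy(records, {"Y"})
print([r["id"] for r in baseline])
```

Passing the codes in as a parameter keeps the choice of what counts as baseline with the caller, which matters because the annotation differs between data sources.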