Reharmonising a file with error #1226
Comments
When unzipped, two files appeared. I fixed the header for both, but the metadata YAML files are missing. Since the study is old, the data is not available from the Ingest API.
Submitted to codon with the submission script
Using the script at
Added
Harmonised files, metadata files, running logs and .tbi files have been copied to the respective harmonised directories.
This is confirmed done; @earlEBI will double-check.
Reopening as the YAML files do not look quite right (is_harmonised = false).
Fixed the following fields:
@earlEBI Could you check again please? Thanks.
@earlEBI please confirm
I thought that was because it's a very old submission. Also, they are not available in the Ingest API. @sajo-ebi https://www.ebi.ac.uk/gwas/ingest/api/v2/studies/GCST008396
Old studies are meant to be retrieved from the public REST API: https://www.ebi.ac.uk/gwas/rest/api/studies/GCST008396
TODO: Update the sumstats tools so that we fall back to the REST API if the Ingest API does not return any data.
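A minimal sketch of the fallback described in the TODO, not the actual sumstats tools code. The function name `fetch_study_metadata` and the injected `fetch_json` callable are hypothetical; `fetch_json` is assumed to return parsed JSON, or `None` (or an empty value) when the endpoint has no data for the study. Only the two URLs come from this thread.

```python
from typing import Any, Callable, Optional

# URL templates taken from the two links quoted in this thread.
INGEST_URL = "https://www.ebi.ac.uk/gwas/ingest/api/v2/studies/{accession}"
REST_URL = "https://www.ebi.ac.uk/gwas/rest/api/studies/{accession}"


def fetch_study_metadata(
    accession: str,
    fetch_json: Callable[[str], Optional[Any]],
) -> Optional[Any]:
    """Try the Ingest API first, then fall back to the public REST API.

    `fetch_json` is a hypothetical helper supplied by the caller; it should
    return None or an empty value when an endpoint has nothing for the study.
    """
    for template in (INGEST_URL, REST_URL):
        data = fetch_json(template.format(accession=accession))
        if data:  # first endpoint that returns non-empty data wins
            return data
    return None  # neither API knows this accession
```

Passing the fetcher in keeps the fallback order separate from the HTTP client, so the logic can be checked without network access.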
Harmonisation done, but the YAML file has some missing data.
Regenerated YAML files for GCST002047 and GCST008396. Expect them in the public FTP in 2 days.
The YAML files are in the staging FTP but not in the public FTP. They did not sync because our ftp-sync code only picks up files whose names start with 'GCST*'. See https://github.com/EBISPOT/gwas-utils/blob/6fbf2c7a6d6fdfc79e0b8c2d1e74539bb1073303/ftpSummaryStatsScript/ftp_sync.py#L186-L188 Will rename the files; expect them in the public FTP in 2 days.
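To illustrate why the files were skipped, here is a rough sketch of a 'GCST*' prefix filter. It is not the code in `ftp_sync.py`; the function name `files_to_sync` and the example file names are made up for illustration.

```python
from fnmatch import fnmatchcase


def files_to_sync(filenames, pattern="GCST*"):
    """Keep only the file names that match the sync pattern.

    fnmatchcase is used so that matching is case-sensitive on every platform.
    """
    return [name for name in filenames if fnmatchcase(name, pattern)]


# Hypothetical file names: the last one lacks the GCST prefix and is dropped.
candidates = [
    "GCST008396.h.tsv.gz",
    "GCST008396.h.tsv.gz-meta.yaml",
    "study-meta.yaml",
]
synced = files_to_sync(candidates)
```

Under this assumption, any YAML file that is not named with a leading accession is silently left behind in staging, which matches what was observed.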
Agreed to keep original files as per old guidelines
GCST002047 was not harmonised successfully because our harmonisation pipeline cannot recognise the column “Effect_Allele”. The pipeline reads the “effect_allele” column in the input file to harmonise each variant, but because the header does not match, all data in this column is read as NA. This is why all variants give hm 14. If we change the header of this file, it should harmonise correctly (the same applies to other_allele).
Please fix the file and re-queue it for harmonisation.
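The header fix could be sketched roughly as below. This is only an illustration, assuming the mismatch is purely one of letter case; `normalise_header` and `EXPECTED_COLUMNS` are hypothetical names, not part of the harmonisation pipeline.

```python
# Column names the pipeline is described as expecting in this issue.
EXPECTED_COLUMNS = ("effect_allele", "other_allele")


def normalise_header(columns):
    """Rename case variants of the expected columns; leave the rest untouched."""
    expected = {name.lower(): name for name in EXPECTED_COLUMNS}
    return [expected.get(col.strip().lower(), col) for col in columns]


# Hypothetical header row; only the two allele columns are renamed.
fixed = normalise_header(["SNP", "Effect_Allele", "Other_Allele", "P"])
```

Restricting the rename to a known set of columns avoids changing other headers that the file format may define with specific capitalisation.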