allow upload and download of JSON for functional annotation #2601

nathandunn · 2021-03-31T19:08:49Z

Should support both downloading and uploading functional annotation data as JSON. Minimally this will include:

GO
Gene Product
Provenance

May also include attributes, PMID, etc., but that may not be necessary if there is no immediate need.

nathandunn · 2021-03-31T19:56:12Z

from google doc:

I can add this to the user-interface pretty easily. However, I have some questions:
(1) Do we upload all GO annotations at once or individually, and then separately for Gene Product and Provenance, or do we upload them all together? If done individually, are you doing them one annotation at a time or in a batch?
(2) Putting this in the UI is very doable, but if you are getting it from a remote service and want to plug it in here, doing it via a script that pulls from one web service and adds to another might make more sense. Perhaps even a "load annotation from URL"? If not, doing it from the UI is easy.

nathandunn · 2021-04-12T19:09:26Z

Funtional_annotation_workflow.pdf

response:

All data should be loaded from a single JSON file
When the data is exported from the main database, it may contain errors and be incomplete. For example, some organisms were annotated a long time ago and do not have the with_in keyword. Other problems are GO terms without evidence and obsolete GO terms. Would it be possible to load the data directly into the web form without checking, so the annotators can correct it before it is saved to the database?
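
One way to read this request is "flag problems, but never reject the entry". The sketch below illustrates that idea only; it is not Apollo code. The key names `go_term`, `evidence` and `with_in`, the function name, and the sample identifiers are all assumptions made up for the example, and the set of obsolete terms is assumed to be supplied by the caller.

```python
def collect_warnings(go_annotations, obsolete_terms=frozenset()):
    """Return (index, message) pairs for problems; never raises or drops an entry."""
    warnings = []
    for index, annotation in enumerate(go_annotations):
        term = annotation.get("go_term")  # hypothetical key name
        if not annotation.get("evidence"):  # hypothetical key name
            warnings.append((index, "missing evidence"))
        if not annotation.get("with_in"):  # key name taken loosely from the comment above
            warnings.append((index, "missing with_in"))
        if term in obsolete_terms:
            warnings.append((index, f"obsolete GO term {term}"))
    return warnings


# Made-up placeholder records, not real annotations.
annotations = [
    {"go_term": "GO:9999998", "evidence": "IEA", "with_in": "EXAMPLE:0001"},
    {"go_term": "GO:9999999"},
]
problems = collect_warnings(annotations, obsolete_terms={"GO:9999999"})
for index, message in problems:
    print(f"annotation {index}: {message}")
```

Because the function only reports, every record can still be shown in the form and the warnings can be displayed next to the entries that need attention.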

nathandunn · 2021-04-12T19:15:04Z

@mbc32 I would propose loading the output JSON into its own app (or even something like a JSON beautifier) in tree mode.

Once fixed and validated, I would use the web service to load the JSON feature by feature.

If it's too slow, I can write an end-point for bulk loading. We might be able to add one to the python-apollo library as well.

Anyway, that is my 2 cents on that, but happy to discuss further.

mbc32 · 2021-05-04T15:18:22Z

Hi @nathandunn
If you could implement the functionality below before you leave, it would be very helpful. If the functionality is there, we can modify it later when the format of the JSON file has been finalized. If you need more information, please ask me.
Notes from VEuPathDB Apollo meeting:
The aim is to make a mechanism that enables the annotator to load functional annotation for a single gene via the user interface (Open annotation) from a JSON file.

There should be one mechanism to load 'GO', 'Gene Product', 'Provenance' from a single JSON file.
The format of the JSON file can be finalized later.

nathandunn · 2021-05-04T15:37:14Z

FYI @rbuels

nathandunn · 2021-05-04T15:49:53Z

@mbc32 I'd like to have an idea of the JSON you'll have to upload so that you know what it converts into.

If we are going to do it this way, I would probably do something like:

{ go:  [ { } , {} ], provenance: [{ }, {}], gene_product: [{}, {}] }

where the empty objects are the valid annotations already supported by the existing web services. That being said, if you are pulling these out of a database, it would be trivial (and possibly cleaner) to call a web service to do the same thing, but I'm unsure what the curator workflow is, how they are pulling JSON, etc. They would have to be aware of the uniqueName, however.
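
As a rough illustration of the structure proposed above, here is a minimal Python sketch that reads one JSON document and returns a list for each of the three sections. The function name is made up for this example, and treating a missing section as an empty list is an assumption rather than something Apollo is known to do. Note that a real JSON file needs the keys quoted (`"go"`, not `go`).

```python
import json

# Section names follow the structure proposed above; everything else is illustrative.
SECTIONS = ("go", "gene_product", "provenance")


def parse_functional_annotation(text):
    """Parse one JSON document and return a dict with a list for every section.

    A section that is absent from the input becomes an empty list.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top level must be a JSON object")

    result = {}
    for name in SECTIONS:
        entries = data.get(name, [])
        if not isinstance(entries, list):
            raise ValueError(f"'{name}' must be a list of annotation objects")
        result[name] = entries
    return result


# Empty objects stand in for annotations, as in the proposal above.
example = '{"go": [{}, {}], "gene_product": [{}]}'
parsed = parse_functional_annotation(example)
print({name: len(entries) for name, entries in parsed.items()})
```

Keeping all three sections in one object means a single file can carry everything for a gene, while each section can still be handed to its own web service call.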

I could also add it here: https://github.com/galaxy-genome-annotation/python-apollo/

so the command would be: arrow annotations add_go <json_file> etc.

nathandunn · 2021-05-04T15:53:03Z

What are you pulling in from the existing database?

mbc32 · 2021-05-04T16:13:05Z

Hi @nathandunn

The JSON schema looks fine to me.
We would be getting all existing functional annotation for a gene, then correcting it or adding to it in Apollo.
Would there be any way to load the data directly into the web forms from a JSON file, bypassing the database and data checks?
For this to work, the annotator should not be required to find and copy the uniqueName.

nathandunn · 2021-05-04T17:05:53Z

That makes sense. Could you provide a sample json that you pull from the functional annotation database so that I can make sure we can convert it? Thanks, Nathan


mbc32 · 2021-05-05T16:10:11Z

Hi @nathandunn
I made a JSON schema and example:
apollo_FA_schema_example.tar.gz

There are some points:

Each section must be optional, e.g. for now no genes in VEuPathDB have provenance.
For genes we also need gene name and symbol, but those are handled by a different part of the web service, the Annotation Service (setName, setSymbol). Not sure how to include them.
A gene can have several transcript isoforms. Not sure if we should nest the JSON: {gene:[{/transcript/},{}]}

nathandunn · 2021-05-05T16:34:20Z

@mbc32 I think your schema makes sense. What I'm missing is the process. I'm a little leery of pushing raw JSON through a UI (that's why we have a UI!), though we can do it if that is what is needed.

My understanding of the process is:

you create an annotation in Apollo from structural data
you pull the functional annotations from an existing database where symbol and name match
you push the functional annotation into Apollo matching the name / symbol, etc.

I'm assuming you will do this at the gene and transcript level both?

mbc32 · 2021-05-06T11:15:41Z

Hi @nathandunn ,
I agree the workflow may be odd. What we are trying to do is to have one place, and one place only, where all the correct information is together at the same time.
Workflow:

(1) Export existing functional annotation from our main database to JSON, e.g. a gene with GO annotation.
(2) Import the JSON into Apollo.
(3) Add or update functional annotation, e.g. update product name, add a new GO term, delete a GO term.
(4) Export the functional annotation from Apollo via GFF.
(5) Load the new functional annotation into our main database, overwriting any existing annotation.

nathandunn · 2021-05-06T15:14:12Z

@mbc32 I think your workflow makes a lot of sense.

I'm just wondering for step 2 if having a command-line interface would make more sense?

For step 1 and 5, are you doing that via an interface or via scripts?

Are curators doing all of this one at a time or are you doing a bulk load of functional annotations?

Thanks.

mbc32 · 2021-05-07T15:57:02Z

Hi @nathandunn
Step 2 is done by the annotator while working on a single gene, so doing it via the user interface would be best.
We still have to implement the functionality for step 1.
Step 5 is done in bulk once we release our data.

nathandunn · 2021-05-07T16:28:34Z

If With / From is not provided, we should add NOT_PROVIDED:UNKNOWN, and the same for reference.
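
The defaulting rule above could look roughly like the sketch below. It is a guess at the behaviour, not the code that was merged: the function name is invented, and the key names `withOrFrom` and `reference` (and `goTerm` in the sample record) are assumptions about the annotation JSON. The PMID in the sample is a made-up placeholder.

```python
PLACEHOLDER = "NOT_PROVIDED:UNKNOWN"


def fill_missing(annotation, keys=("withOrFrom", "reference")):
    """Return a copy of the annotation with absent or empty keys set to the placeholder."""
    filled = dict(annotation)  # copy so the caller's record is left untouched
    for key in keys:
        value = filled.get(key)
        is_blank_string = isinstance(value, str) and not value.strip()
        if value is None or is_blank_string or value == []:
            filled[key] = PLACEHOLDER
    return filled


# Sample record with a reference but no With / From value.
annotation = {"goTerm": "GO:0008150", "reference": "PMID:0000000"}
print(fill_missing(annotation))
```

Here the existing reference is kept and only the missing With / From value receives the placeholder, so an incomplete record can still be loaded and corrected later.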

* fixes #2601 when complete
* added reasonable UI
* kind of validating
* added example
* works, but only for the first annotation
* fixed compilation errors
* fixed empty references and withOrFrom
* added
* updated
* server code working
* updated rest doc and added reload
* fixed messages
* fixed messages
* fixed null references
* fixed formatting
* fixed deletions
* removed console logs
* updated REST api
* added changelog
* fixed calls to clear

nathandunn added the BBSRC label Mar 31, 2021

nathandunn added this to To do in 2.6.4 LTS via automation Mar 31, 2021

nathandunn added a commit that referenced this issue May 7, 2021

fixes #2601 when complete

61c82dc

nathandunn mentioned this issue May 7, 2021

fixes #2601 when complete #2617

Merged

12 tasks

nathandunn closed this as completed in #2617 May 10, 2021

2.6.4 LTS automation moved this from To do to Done May 10, 2021
