Access to molecule_form is not filtered properly by parent identifier #113

Closed
piotr-gawron opened this Issue Oct 24, 2016 · 8 comments


@piotr-gawron
piotr-gawron commented Oct 24, 2016 edited

I'm using queries like https://www.ebi.ac.uk/chembl/api/data/molecule_form?parent=CHEMBL660 to obtain information about the hierarchical structure of ChEMBL compounds. This worked for some time, but in the past few days or weeks it has changed: filtering on the parent field no longer works. When I run the query now, I get seemingly random compounds with no relation to the compound in the query.

Whereas some time ago the response looked like this:

<response>
  <molecule_forms>
    <molecule_form>
      <molecule_chembl_id>CHEMBL1445834</molecule_chembl_id>
      <parent>False</parent>
    </molecule_form>
    <molecule_form>
      <molecule_chembl_id>CHEMBL1569</molecule_chembl_id>
      <parent>False</parent>
    </molecule_form>
    <molecule_form>
      <molecule_chembl_id>CHEMBL465617</molecule_chembl_id>
      <parent>False</parent>
    </molecule_form>
    <molecule_form>
      <molecule_chembl_id>CHEMBL660</molecule_chembl_id>
      <parent>True</parent>
    </molecule_form>
  </molecule_forms>
  <page_meta>
    <limit>20</limit>
    <next/>
    <offset/>
    <previous/>
    <total_count>4</total_count>
  </page_meta>
</response>
@mnowotka mnowotka added the bug label Oct 24, 2016
@mnowotka mnowotka self-assigned this Oct 24, 2016
@mnowotka
Member

Interesting! Let me have a look...

@mnowotka
Member

OK, so I changed this endpoint as part of the release. Now, for a given compound, if you want to explore its hierarchy you should call: https://www.ebi.ac.uk/chembl/api/data/molecule_form/CHEMBLID.json, so in your example: https://www.ebi.ac.uk/chembl/api/data/molecule_form/CHEMBL660.json

This will return this document:

{
    "molecule_forms": [
        {
            "is_parent": "False",
            "molecule_chembl_id": "CHEMBL465617",
            "parent_chembl_id": "CHEMBL660"
        },
        {
            "is_parent": "True",
            "molecule_chembl_id": "CHEMBL660",
            "parent_chembl_id": "CHEMBL660"
        },
        {
            "is_parent": "False",
            "molecule_chembl_id": "CHEMBL1445834",
            "parent_chembl_id": "CHEMBL660"
        },
        {
            "is_parent": "False",
            "molecule_chembl_id": "CHEMBL1569",
            "parent_chembl_id": "CHEMBL660"
        }
    ],
    "page_meta": {
        "limit": 20,
        "next": null,
        "offset": 0,
        "previous": null,
        "total_count": 4
    }
}

Does it make sense?
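
For readers updating client code, here is a minimal sketch of how the per-compound response above could be consumed. The helper names (`molecule_form_url`, `split_hierarchy`) are hypothetical and not part of any ChEMBL client library. The sketch parses the sample document from this comment rather than calling the live service, and it assumes `is_parent` arrives as the string "True" or "False", as in that sample.

```python
import json

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def molecule_form_url(chembl_id: str) -> str:
    """Build the per-compound molecule_form URL (hypothetical helper)."""
    return f"{BASE_URL}/molecule_form/{chembl_id}.json"


def split_hierarchy(document: dict) -> tuple:
    """Return (parent_id, sorted_child_ids) from a molecule_form document."""
    parent = None
    children = []
    for form in document["molecule_forms"]:
        # The sample response encodes is_parent as a string, not a JSON boolean.
        if str(form["is_parent"]).lower() == "true":
            parent = form["molecule_chembl_id"]
        else:
            children.append(form["molecule_chembl_id"])
    return parent, sorted(children)


# Sample document copied from the response shown in this thread.
SAMPLE = json.loads("""
{
    "molecule_forms": [
        {"is_parent": "False", "molecule_chembl_id": "CHEMBL465617", "parent_chembl_id": "CHEMBL660"},
        {"is_parent": "True", "molecule_chembl_id": "CHEMBL660", "parent_chembl_id": "CHEMBL660"},
        {"is_parent": "False", "molecule_chembl_id": "CHEMBL1445834", "parent_chembl_id": "CHEMBL660"},
        {"is_parent": "False", "molecule_chembl_id": "CHEMBL1569", "parent_chembl_id": "CHEMBL660"}
    ],
    "page_meta": {"limit": 20, "next": null, "offset": 0, "previous": null, "total_count": 4}
}
""")

parent, children = split_hierarchy(SAMPLE)
print(molecule_form_url("CHEMBL660"))
print(parent, children)
```

To use this against the live service, the document could be fetched from `molecule_form_url("CHEMBL660")` with any HTTP client and passed to `split_hierarchy` after JSON decoding.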

@piotr-gawron

Sure, I will update my code.
But maybe you should consider disabling the parent parameter? An error message is better than random output ;-)

@mnowotka
Member

By design, we silently ignore parameters that don't exist within the resource. So https://www.ebi.ac.uk/chembl/api/data/molecule_form.json?foo=bla is equivalent to https://www.ebi.ac.uk/chembl/api/data/molecule_form.json? but https://www.ebi.ac.uk/chembl/api/data/molecule_form.json?parent_chembl_id=CHEMBL660 will return a message informing you that you are not allowed to apply filters on this field. This is because of the nature of a REST API: it's often used in browsers, and invalid parameters may simply have been copied over from some other URL. Since 'parent' does not exist (there is an 'is_parent' flag), it was ignored.
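
The behaviour described above can be illustrated with a small sketch. This is not the actual chembl_webservices code. The field sets, the `FilterNotAllowed` exception and the `select_filters` function are all hypothetical, chosen only to mirror the three cases in this comment: an unknown parameter is dropped, a known but non-filterable field is rejected, and a filterable field is applied.

```python
# Assumed field sets for illustration only; the real resource may differ.
KNOWN_FIELDS = {"molecule_chembl_id", "parent_chembl_id", "is_parent"}
FILTERABLE_FIELDS = {"molecule_chembl_id"}  # assumption, not taken from the source


class FilterNotAllowed(Exception):
    """Raised when a known field is used as a filter but filtering is disabled."""


def select_filters(params: dict) -> dict:
    """Return the filters that would be applied for the given query parameters."""
    applied = {}
    for name, value in params.items():
        if name not in KNOWN_FIELDS:
            # Unknown parameter (e.g. 'foo' or the old 'parent'): silently ignored.
            continue
        if name not in FILTERABLE_FIELDS:
            # Known field, but filtering on it is not permitted.
            raise FilterNotAllowed(f"filtering on '{name}' is not allowed")
        applied[name] = value
    return applied


print(select_filters({"foo": "bla"}))
print(select_filters({"parent": "CHEMBL660"}))
```

Under these assumptions, both calls yield an empty filter set, which is why a query with `parent=CHEMBL660` behaves like an unfiltered listing.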

@mnowotka
Member

BTW: the output is not random at all. Since https://www.ebi.ac.uk/chembl/api/data/molecule_form?parent=CHEMBL660 is equivalent to https://www.ebi.ac.uk/chembl/api/data/molecule_form, it simply shows all the parent-child relations stored in ChEMBL (actually the first 20 of them because of the pagination).
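
To see more than the first page, a client can follow the `next` entry in `page_meta`. The sketch below assumes that `next` holds a location that can be fetched to obtain the following page, and that it is null on the last page, as in the sample response earlier in this thread. `iter_molecule_forms` and `fetch_page` are hypothetical names. The pages used here are made-up stand-ins so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def iter_molecule_forms(fetch_page: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every molecule_form entry, following page_meta.next until it is null."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["molecule_forms"]
        url = page["page_meta"]["next"]  # assumed to be None on the last page


# Stand-in pages keyed by URL; illustrative data only, not real ChEMBL output.
FAKE_PAGES = {
    "page1": {
        "molecule_forms": [{"molecule_chembl_id": "A"}, {"molecule_chembl_id": "B"}],
        "page_meta": {"next": "page2"},
    },
    "page2": {
        "molecule_forms": [{"molecule_chembl_id": "C"}],
        "page_meta": {"next": None},
    },
}

ids = [f["molecule_chembl_id"] for f in iter_molecule_forms(FAKE_PAGES.__getitem__, "page1")]
print(ids)
```

In real use, `fetch_page` would be a function that performs the HTTP request and decodes the JSON body.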

@piotr-gawron

Well,
This parameter was valid in the previous version, so after it was removed from the API, old queries should inform the user that it is no longer valid. That's at least my impression.

In some cases you do return error messages, for example: https://www.ebi.ac.uk/chembl/api/data/molecule_form?parent_chembl_id=CHEMBL660

Btw, you still have the "parent" parameter in your documentation for this method: https://www.ebi.ac.uk/chembl/api/data/molecule_form/schema

Anyway, thanks for the help.
I really appreciate it.

@mnowotka
Member

Yes, I can certainly correct the documentation.

@mnowotka
Member

OK, so there was a bug and your issue helped me identify it. The bug is here: https://github.com/chembl/chembl_webservices_2/blob/master/chembl_webservices/resources/molecule_forms.py#L42 where the 'filtering' part still mentions the old field name. I released the fix which makes:

@mnowotka mnowotka closed this Oct 26, 2016