# Intermine-Python: Tutorial 3: More about Constraints

In the previous tutorial, we learnt how to add constraints to a query so that we could filter the results. In this tutorial we will look at some more constraints and at the different types of constraints.

In [4]:
#Initial setup
from intermine.webservice import Service

In [5]:
# Connect to the FlyMine web service and start a query rooted at the Gene class
service = Service("www.flymine.org/flymine/service")
query = service.new_query("Gene")

##### Unary Constraint

The first type of constraint that we will look at is a Unary Constraint. A Unary Constraint does not take a value; it checks whether a particular attribute is present or absent. The unary operators are `IS NULL` and `IS NOT NULL`. We can look at a small example.
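As a minimal sketch of the call shape, the helper below wraps both unary operators. `require_field` is a hypothetical name introduced here for illustration and is not part of intermine; it assumes `query` is an object returned by `service.new_query(...)` and relies only on the `add_constraint(field_name, operator)` form used in this tutorial.

```python
def require_field(query, field_name, present=True):
    """Add a unary constraint on `field_name`.

    Hypothetical helper (not part of intermine). `query` is assumed
    to be an object returned by service.new_query(...).
    """
    operator = "IS NOT NULL" if present else "IS NULL"
    # A unary constraint takes a field and an operator, but no value.
    return query.add_constraint(field_name, operator)


# Example calls (these need a live query, so they are left commented out):
# require_field(query, "name")                 # keep genes that have a name
# require_field(query, "name", present=False)  # keep genes without a name
```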

In [3]:
# Syntax: query.add_constraint(field_name,"IS NOT NULL")
# Try constraining the primaryIdentifier field to be not null.


<UnaryConstraint: Gene.primaryIdentifier IS NOT NULL>

In [4]:
for row in query.rows(size=10):
    print(row)

Gene: briefDescription=None cytoLocation='-' description=None id=1000415 length=12653 name='zydeco' primaryIdentifier='FBgn0265767' score=None scoreType=None secondaryIdentifier='CG2893' symbol='zyd'
Gene: briefDescription=None cytoLocation='-' description=None id=1004698 length=1951 name=None primaryIdentifier='FBgn0039942' score=None scoreType=None secondaryIdentifier='CG17163' symbol='CG17163'
Gene: briefDescription=None cytoLocation='-' description=None id=1005938 length=12892 name='Rho GTPase activating protein at 1A' primaryIdentifier='FBgn0025836' score=None scoreType=None secondaryIdentifier='CG40494' symbol='RhoGAP1A'
Gene: briefDescription=None cytoLocation='-' description=None id=1007519 length=21475 name='verthandi' primaryIdentifier='FBgn0260987' score=None scoreType=None secondaryIdentifier='CG17436' symbol='vtd'
Gene: briefDescription=None cytoLocation='-' description=None id=1015398 length=14286 name='Maf1' primaryIdentifier='FBgn0267861' score=None scoreType=None secon

##### Binary Constraint

The next type of constraint is a Binary Constraint, which compares a field against a value that you supply. Most of the constraints that we looked at in the second tutorial were binary constraints, and they form the largest group of constraints. The comparison operators are `=`, `!=`, `<`, `<=`, `>` and `>=`.
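The sketch below shows one way to guard against a mistyped operator before it reaches the server. `COMPARISON_OPERATORS` and `add_comparison` are hypothetical names introduced for illustration; the tuple contains only the operators named in this section, and `query` is assumed to come from `service.new_query("Gene")`.

```python
# Only the operators named in this tutorial; the service may accept others.
COMPARISON_OPERATORS = ("=", "!=", "<", "<=", ">", ">=")


def add_comparison(query, field_name, operator, value):
    """Add a binary constraint after checking the operator locally.

    Hypothetical helper (not part of intermine).
    """
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    # A binary constraint takes a field, an operator and one value.
    return query.add_constraint(field_name, operator, value)


# Example calls (these need a live query, so they are left commented out):
# add_comparison(query, "length", "<", 2000)
# add_comparison(query, "symbol", "=", "zen")
```

Checking the operator locally is a design choice, not a requirement: it turns a typo such as `=>` into an immediate `ValueError` in your own code.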

In [6]:
# Syntax is query.add_constraint(field_name,operator,value)
# Try constraining your Gene query to a length that is greater than or equal to 12000


<BinaryConstraint: Gene.length >= 12000>

In [6]:
for row in query.rows(size=10):
    print(row)

Gene: briefDescription=None cytoLocation='-' description=None id=1000415 length=12653 name='zydeco' primaryIdentifier='FBgn0265767' score=None scoreType=None secondaryIdentifier='CG2893' symbol='zyd'
Gene: briefDescription=None cytoLocation='-' description=None id=1005938 length=12892 name='Rho GTPase activating protein at 1A' primaryIdentifier='FBgn0025836' score=None scoreType=None secondaryIdentifier='CG40494' symbol='RhoGAP1A'
Gene: briefDescription=None cytoLocation='-' description=None id=1007519 length=21475 name='verthandi' primaryIdentifier='FBgn0260987' score=None scoreType=None secondaryIdentifier='CG17436' symbol='vtd'
Gene: briefDescription=None cytoLocation='-' description=None id=1015398 length=14286 name='Maf1' primaryIdentifier='FBgn0267861' score=None scoreType=None secondaryIdentifier='CG40196' symbol='Maf1'
Gene: briefDescription=None cytoLocation='-' description=None id=1018843 length=12844 name=None primaryIdentifier='FBgn0039941' score=None scoreType=None seconda

The above constraint is an example of a binary constraint. 

##### Ternary Constraint

We will now look at Ternary constraints. A ternary constraint has one required value and one optional value. Currently, InterMine supports only one such operator: `LOOKUP`. The lookup operator searches through all the fields of a particular class for the value specified by the user. In the example given below, it will search the entire Gene class for any field that contains an occurrence of "zen". The advantage of this is that you do not need to remember whether zen is a symbol, a name or a primaryIdentifier. However, this may lead to ambiguous results, so you can use the optional `extra_value` parameter to narrow the search (for example, to a particular organism when looking up genes).
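A minimal sketch of the two call shapes, with and without the optional value, might look as follows. `lookup` is a hypothetical helper name, not part of intermine, and `query` is assumed to be an object returned by `service.new_query()`.

```python
def lookup(query, class_name, term, extra_value=None):
    """Add a LOOKUP constraint, optionally narrowed by `extra_value`.

    Hypothetical helper (not part of intermine).
    """
    if extra_value is None:
        # Required value only: search every field of the class for `term`.
        return query.add_constraint(class_name, "LOOKUP", term)
    # Optional extra value: narrow the search, e.g. to one organism.
    return query.add_constraint(
        class_name, "LOOKUP", term, extra_value=extra_value
    )


# Example calls (these need a live query, so they are left commented out):
# lookup(query2, "Gene", "zen")
# lookup(query2, "Gene", "zen", extra_value="D. melanogaster")
```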

In [7]:
query2=service.new_query()

In [8]:
# The syntax for the ternary LOOKUP constraint is as follows
# query.add_constraint("Classname", "LOOKUP", "thing_to_search_for", extra_value="extra_here")
# Try looking up a fly gene identifier of your choosing - e.g. perhaps "ZEN" or "FBgn0000055". 
# The extra value for a fly gene would be "D. melanogaster"



<TernaryConstraint: Gene LOOKUP ADH IN D. melanogaster>

In [9]:
for row in query2.rows():
    print(row)

Gene: briefDescription=None cytoLocation=u'35B3-35B3' description=None id=1038479 length=3351 name=u'Alcohol dehydrogenase' primaryIdentifier=u'FBgn0000055' score=None scoreType=None secondaryIdentifier=u'CG3481' symbol=u'Adh'


##### Multi-Value Constraint

The next constraint type that we will look at is the Multi-Value constraint, which compares a field against a list of values rather than a single value. The two operators that are allowed are `ONE OF` and `NONE OF`.
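The two operators can be sketched with a single helper. `constrain_to_values` is a hypothetical name introduced for illustration, not part of intermine; `query` is assumed to come from `service.new_query("Gene")`, and the values are passed as a list, following the syntax used in this tutorial.

```python
def constrain_to_values(query, field_name, values, exclude=False):
    """Add a multi-value constraint on `field_name`.

    Hypothetical helper (not part of intermine).
    """
    operator = "NONE OF" if exclude else "ONE OF"
    # Copy into a list so tuples, sets or generators can also be passed in.
    return query.add_constraint(field_name, operator, list(values))


# Example calls (these need a live query, so they are left commented out):
# constrain_to_values(query3, "symbol", ["zen", "eve"])                # only these
# constrain_to_values(query3, "symbol", ["zen", "eve"], exclude=True)  # all others
```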

In [10]:
query3=service.new_query("Gene")

In [11]:
# syntax here is query.add_constraint("field_name","NONE OF",['value_one','value_two'])



<MultiConstraint: Gene.symbol NONE OF ['zen', 'eve']>

In [12]:
for row in query3.rows(size=10):
    print(row)

Gene: briefDescription=None cytoLocation='-' description=None id=1000415 length=12653 name='zydeco' primaryIdentifier='FBgn0265767' score=None scoreType=None secondaryIdentifier='CG2893' symbol='zyd'
Gene: briefDescription=None cytoLocation='-' description=None id=1004698 length=1951 name=None primaryIdentifier='FBgn0039942' score=None scoreType=None secondaryIdentifier='CG17163' symbol='CG17163'
Gene: briefDescription=None cytoLocation='-' description=None id=1005938 length=12892 name='Rho GTPase activating protein at 1A' primaryIdentifier='FBgn0025836' score=None scoreType=None secondaryIdentifier='CG40494' symbol='RhoGAP1A'
Gene: briefDescription=None cytoLocation='-' description=None id=1007519 length=21475 name='verthandi' primaryIdentifier='FBgn0260987' score=None scoreType=None secondaryIdentifier='CG17436' symbol='vtd'
Gene: briefDescription=None cytoLocation='-' description=None id=1015398 length=14286 name='Maf1' primaryIdentifier='FBgn0267861' score=None scoreType=None secon

##### List Constraint

List constraints let you restrict a query to objects that are, or are not, members of a saved, named list, using the operators `IN` and `NOT IN`. An example is given below. The path in such a constraint must always be a class (for example, Gene is a valid path). The lists available in FlyMine can be found at: http://www.flymine.org/flymine/bag.do?subtab=view .
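As a rough sketch, both list operators can be wrapped in one helper. `constrain_to_list` is a hypothetical name, not part of intermine; `query` is assumed to come from `service.new_query()`, and the list name must be one that exists on the server you are querying.

```python
def constrain_to_list(query, class_name, list_name, member=True):
    """Keep objects that are (or are not) members of a named list.

    Hypothetical helper (not part of intermine).
    """
    operator = "IN" if member else "NOT IN"
    # The path is a class such as "Gene", not an attribute such as "Gene.symbol".
    return query.add_constraint(class_name, operator, list_name)


# Example calls (these need a live query, so they are left commented out):
# constrain_to_list(query4, "Gene", "PL FlyAtlas_brain_top")
# constrain_to_list(query4, "Gene", "PL FlyAtlas_brain_top", member=False)
```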

In [10]:
query4=service.new_query()

In [11]:
# Syntax: query.add_constraint(class_name,"IN",list_name)
# Try selecting a Gene that is in the list "PL FlyAtlas_brain_top"



<ListConstraint: Gene IN PL FlyAtlas_brain_top>

In [12]:
for row in query4.rows(size=10):
    print(row)

Gene: briefDescription=None cytoLocation=u'10A3-10A3' description=None id=1109635 length=2075 name=None primaryIdentifier=u'FBgn0030259' score=None scoreType=None secondaryIdentifier=u'CG1545' symbol=u'CG1545'
Gene: briefDescription=None cytoLocation=u'11D8-11D8' description=None id=1040181 length=90456 name=u'radish' primaryIdentifier=u'FBgn0265597' score=None scoreType=None secondaryIdentifier=u'CG44424' symbol=u'rad'
Gene: briefDescription=None cytoLocation=u'14A1-14A1' description=None id=1041556 length=26224 name=u'mind-meld' primaryIdentifier=u'FBgn0259110' score=None scoreType=None secondaryIdentifier=u'CG42252' symbol=u'mmd'
Gene: briefDescription=None cytoLocation=u'16F3-16F5' description=None id=1061141 length=138941 name=u'Shaker' primaryIdentifier=u'FBgn0003380' score=None scoreType=None secondaryIdentifier=u'CG12348' symbol=u'Sh'
Gene: briefDescription=None cytoLocation=u'18C2-18C3' description=None id=1078562 length=21373 name=u'nicotinic Acetylcholine Receptor alpha7' pr

##### Sub-Class Constraints

The InterMine data model is hierarchical: a class can have sub-classes. A sub-class constraint restricts a path to one particular sub-class of its class, so that the results contain only items of that sub-class. The example below shows a sub-class constraint.
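The call shape can be sketched as below. `constrain_to_subclass` is a hypothetical helper name, not part of intermine, and `query` is assumed to come from `service.new_query("Gene")`. The path and sub-class names in the example are the ones used in this tutorial.

```python
def constrain_to_subclass(query, path, subclass_name):
    """Restrict `path` to objects of the class `subclass_name`.

    Hypothetical helper (not part of intermine).
    """
    # Only two arguments are passed: the path and the sub-class name.
    return query.add_constraint(path, subclass_name)


# Example call (needs a live query, so it is left commented out):
# constrain_to_subclass(query5, "Gene.ontologyAnnotations", "GOAnnotation")
```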

In [19]:
query5=service.new_query("Gene")

In [20]:
# Syntax: query5.add_constraint(path, sub_class)
# Try constraining the path "Gene.ontologyAnnotations" to the sub-class "GOAnnotation"



<SubClassConstraint: Gene.ontologyAnnotations ISA GOAnnotation>

In [21]:
for row in query5.rows(size=10):
    print(row)

Gene: briefDescription=None cytoLocation='-' description=None id=1000415 length=12653 name='zydeco' primaryIdentifier='FBgn0265767' score=None scoreType=None secondaryIdentifier='CG2893' symbol='zyd'
Gene: briefDescription=None cytoLocation='-' description=None id=1005938 length=12892 name='Rho GTPase activating protein at 1A' primaryIdentifier='FBgn0025836' score=None scoreType=None secondaryIdentifier='CG40494' symbol='RhoGAP1A'
Gene: briefDescription=None cytoLocation='-' description=None id=1007519 length=21475 name='verthandi' primaryIdentifier='FBgn0260987' score=None scoreType=None secondaryIdentifier='CG17436' symbol='vtd'
Gene: briefDescription=None cytoLocation='-' description=None id=1015398 length=14286 name='Maf1' primaryIdentifier='FBgn0267861' score=None scoreType=None secondaryIdentifier='CG40196' symbol='Maf1'
Gene: briefDescription=None cytoLocation='-' description=None id=1018843 length=12844 name=None primaryIdentifier='FBgn0039941' score=None scoreType=None seconda

Unlike most constraints, a sub-class constraint takes no operator argument: you pass only the path and the name of the sub-class.

This tutorial summed up some of the important constraint types. In the next tutorial we will look at some of the other features of a query. 