# Data Analysis

If you want to run your own analysis of the data in `db.sqlite3` and plan to use Python, you can take advantage of the project's Django code. This Jupyter notebook shows how to enable that code.

## Setup and run

To set up your environment to run this Jupyter notebook, you need to install some packages. We suggest running

~~~
$ python -m pip install -r requirements.txt
$ python -m pip install -r requirements-jupyter.txt
~~~

from your terminal.

To start the Jupyter server, run

~~~
$ python manage.py shell_plus --notebook
~~~

## Basic (Django Part)

You can use the full power of Django in the notebook. For example, to gain access to the models, use

In [11]:
import lowfat.models as models

To select all the fellows you can use

In [14]:
fellows = models.Claimant.objects.filter(fellow=True)
fellows

<QuerySet [<Claimant: Black Widow (2016 ✓)>, <Claimant: The Hulk (2016 ✓)>, <Claimant: Green Arrow (2015 ✓)>, <Claimant: Iron Man (2015 ✓)>, <Claimant: Captain America (2014 ✓)>]>

Remember that the `Claimant` table can contain entries that aren't fellows, which is why we need `.filter(fellow=True)`.

## Basic (Pandas Part)

You can use Pandas with Django by loading a `QuerySet` into a `DataFrame`.

In [15]:
import pandas as pd

pd.DataFrame(list(fellows.values()))

Unnamed: 0,added,affiliation,application_year,attended_collaborations_workshop,attended_inaugural_meeting,bitbucket,career_stage_when_apply,carpentries_instructor,claimantship_grant,collaborator,...,screencast_url,slug,surname,terms_and_conditions_id,twitter,updated,user_id,website,website_feed,work_description
0,2016-07-07 14:59:46.412,College,2016,False,False,,3,False,3000.0,False,...,,black-widow,Widow,2017,BlackWidow,2018-02-06 15:48:04.747,3,http://black-widow.fake/,http://black-widow.fake/feed/,Work.
1,2016-07-07 14:59:46.412,University,2016,False,False,,3,False,3000.0,False,...,,the-hulk,Hulk,2017,TheHulk,2018-02-06 15:26:59.858,2,http://the-hulk.fake/,http://the-hulk.fake/feed/,Work
2,2016-07-07 14:59:46.412,University,2015,False,False,,3,False,3000.0,False,...,,green-arrow,Arrow,2016,GreenArrow,2018-02-06 15:27:13.848,4,http://green-arrow.fake/,http://green-arrow.fake/feed/,Work
3,2016-07-07 14:59:46.412,University,2015,False,False,,3,False,3000.0,False,...,,iron-man,Man,2016,IronMan,2018-02-06 15:27:25.708,5,http://iron-man.fake/,http://iron-man.fake/feed/,Tech
4,2016-07-07 14:59:46.412,College,2014,False,False,,3,False,3000.0,False,...,,captain-america,America,2015,CaptainAmerica,2018-02-06 15:27:39.366,6,http://captain-america.fake/,http://captain-america.fake/feed/,Work


To convert a Django `QuerySet` into a Pandas `DataFrame`, follow the previous example: call `.values()` on the `QuerySet` and wrap the result in `list()`, because Pandas can't process Django `QuerySet`s directly.
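
The conversion can be wrapped in a small helper that also restricts the columns. This is only a sketch and is not part of lowfat: `queryset_to_frame` is a hypothetical name, and `StandInQuerySet` merely imitates the `.values()` method of a Django `QuerySet` so that the snippet runs outside the notebook. Inside the notebook you would pass `fellows` itself.

```python
import pandas as pd


def queryset_to_frame(queryset, *fields):
    """Build a DataFrame from a QuerySet (hypothetical helper, not in lowfat).

    `queryset.values(*fields)` yields one dict per row; `list()` evaluates it
    so that pandas receives plain Python data.
    """
    return pd.DataFrame(list(queryset.values(*fields)))


class StandInQuerySet:
    """Imitates only `QuerySet.values()`; it keeps this sketch self-contained."""

    def __init__(self, rows):
        self._rows = rows

    def values(self, *fields):
        if not fields:  # like Django, no arguments means every column
            return [dict(row) for row in self._rows]
        return [{name: row[name] for name in fields} for row in self._rows]


# Three columns copied from the output above.
rows = [
    {"slug": "black-widow", "surname": "Widow", "application_year": 2016},
    {"slug": "the-hulk", "surname": "Hulk", "application_year": 2016},
    {"slug": "green-arrow", "surname": "Arrow", "application_year": 2015},
]

frame = queryset_to_frame(StandInQuerySet(rows), "slug", "application_year")
print(frame)
```

Selecting only the columns you need keeps the `DataFrame` narrow, which helps with wide tables such as `Claimant`.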

## Basic (Tagulous)

We use [Tagulous](http://radiac.net/projects/django-tagulous/) as a tag library.

In [21]:
funds = models.Fund.objects.all()
pd.DataFrame(list(funds.values()))

Unnamed: 0,ad_status,added,additional_info,approved,budget_approved,budget_request_attendance_fees,budget_request_catering,budget_request_others,budget_request_subsistence_cost,budget_request_travel,...,lat,lon,mandatory,notes_from_admin,required_blog_posts,start_date,status,title,updated,url
0,V,2016-07-07 14:59:46.412,,NaT,1500.0,0.0,500.0,0.0,0.0,1000.0,...,30.0518,-65.84834,False,,1,2016-09-25,P,9d6816aa - Black Widow,2018-04-13 08:53:41.609261,http://9d6816aa.com
1,V,2016-07-07 14:59:46.412,,2017-05-22 16:14:32.068,500.0,0.0,0.0,0.0,0.0,500.0,...,30.0518,-65.84834,False,,1,2016-08-01,A,9d6816de - Black Widow,2018-04-13 08:53:41.617846,http://9d6816de.com
2,V,2016-07-07 14:59:46.412,,NaT,2000.0,0.0,0.0,0.0,0.0,2000.0,...,-13.13435,-90.89369,False,,1,2016-06-16,F,9d681148 - Captain America,2018-04-13 08:53:41.623118,http://9d681148.com
3,V,2016-07-07 14:59:46.412,,NaT,2500.0,0.0,0.0,0.0,0.0,2500.0,...,3.91522,102.7173,False,,1,2016-05-14,F,9d68144a - Green Arrow,2018-04-13 08:53:41.627944,http://9d68144a.com
4,H,2016-07-13 15:01:25.316,,NaT,0.0,0.0,0.0,0.0,0.0,1500.0,...,15.53126,48.89437,False,,1,2015-07-01,R,9d681d8c - Green Arrow,2018-04-13 08:53:41.633089,http://9d681d8c.com
5,V,2016-07-07 14:59:46.412,,2017-05-22 16:14:32.070,2000.0,0.0,0.0,0.0,1000.0,2000.0,...,-20.63223,-178.32413,False,,1,2015-05-14,A,9d683330 - The Hulk,2018-04-13 08:53:41.638127,http://9d683330.com
6,H,2016-07-13 14:59:44.036,,NaT,0.0,0.0,1000.0,0.0,0.0,0.0,...,-73.47064,-9.87283,False,,1,2014-11-10,F,9d681b66 - Captain America,2018-04-13 08:53:41.643078,http://9d681b66.com
7,V,2016-07-07 14:59:46.412,,NaT,1000.0,0.0,0.0,0.0,0.0,1000.0,...,12.25341,173.44064,False,,1,2014-05-20,F,9d680716 - Iron Man,2018-04-13 08:53:41.650238,http://9d680716.com
8,V,2016-07-07 14:59:46.412,,NaT,1500.0,0.0,0.0,0.0,0.0,1500.0,...,167.75629,-71.04731,False,,1,2014-05-16,F,9d680248 - Iron Man,2018-04-13 08:53:41.655177,http://9d680248.com
9,V,2016-07-13 14:58:08.260,,NaT,500.0,0.0,0.0,0.0,0.0,500.0,...,46.95474,-6.73372,False,,1,2014-02-20,R,9d6818e6 - Captain America,2018-04-13 08:53:41.659972,http://9d6818e6.com


Get a list of all the tags of a fund (here, the first one):

In [27]:
funds[0].grant.all()

<TagTreeModelQuerySet [<Grant: ssi2/fellowship>]>

You can loop over each tag:

In [29]:
for tag in funds[0].grant.all():
    print(tag.name)

ssi2/fellowship
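
The tag name printed above reads like a path, which matches the `TagTreeModelQuerySet` shown earlier. Assuming `/` separates the levels, the name can be split into its parent part and its last component with plain string handling. The helper `split_tag` is a hypothetical name used for illustration; it is not Tagulous's own parsing.

```python
def split_tag(name):
    """Split a tree tag name into (parent, label).

    Assumes "/" separates the levels. A top-level tag has an empty parent.
    """
    parent, _, label = name.rpartition("/")
    return parent, label


parent, label = split_tag("ssi2/fellowship")
print(parent, label)  # ssi2 fellowship
```

Splitting the name this way makes it easy to group funds by the first level of the tag.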


Filter for a specific tag:

In [28]:
models.Fund.objects.filter(grant="ssi2/fellowship")

<CastTaggedQuerySet [<Fund: 9d6816aa - Black Widow (10)>, <Fund: 9d6816de - Black Widow (6)>, <Fund: 9d681148 - Captain America (3)>, <Fund: 9d68144a - Green Arrow (5)>, <Fund: 9d681d8c - Green Arrow (9)>, <Fund: 9d683330 - The Hulk (2)>, <Fund: 9d680716 - Iron Man (4)>, <Fund: 9d680248 - Iron Man (1)>]>

You can query for part of the name of the tag:

In [32]:
models.Fund.objects.filter(grant__name__contains="fellowship")

<CastTaggedQuerySet [<Fund: 9d6816aa - Black Widow (10)>, <Fund: 9d6816de - Black Widow (6)>, <Fund: 9d681148 - Captain America (3)>, <Fund: 9d68144a - Green Arrow (5)>, <Fund: 9d681d8c - Green Arrow (9)>, <Fund: 9d683330 - The Hulk (2)>, <Fund: 9d681b66 - Captain America (8)>, <Fund: 9d680716 - Iron Man (4)>, <Fund: 9d680248 - Iron Man (1)>, <Fund: 9d6818e6 - Captain America (7)>]>

In [33]:
for fund in models.Fund.objects.filter(grant__name__contains="fellowship"):
    print("{} - {}".format(fund, fund.grant.all()))

9d6816aa - Black Widow (10) - ssi2/fellowship
9d6816de - Black Widow (6) - ssi2/fellowship
9d681148 - Captain America (3) - ssi2/fellowship
9d68144a - Green Arrow (5) - ssi2/fellowship
9d681d8c - Green Arrow (9) - ssi2/fellowship
9d683330 - The Hulk (2) - ssi2/fellowship
9d681b66 - Captain America (8) - ssi1/fellowship
9d680716 - Iron Man (4) - ssi2/fellowship
9d680248 - Iron Man (1) - ssi2/fellowship
9d6818e6 - Captain America (7) - ssi1/fellowship
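
The two filters return different numbers of funds because the first matches the exact tag name while the second matches any tag whose name contains the given text. The sketch below reproduces this with plain Python on the pairs printed above, so it runs without Django; the list `fund_tags` is copied from that output and the comparisons are only an analogue of the Django lookups. In the notebook, the same tally could be built with `Counter(tag.name for fund in funds for tag in fund.grant.all())`.

```python
from collections import Counter

# (fund, grant tag) pairs copied from the output above.
fund_tags = [
    ("9d6816aa - Black Widow (10)", "ssi2/fellowship"),
    ("9d6816de - Black Widow (6)", "ssi2/fellowship"),
    ("9d681148 - Captain America (3)", "ssi2/fellowship"),
    ("9d68144a - Green Arrow (5)", "ssi2/fellowship"),
    ("9d681d8c - Green Arrow (9)", "ssi2/fellowship"),
    ("9d683330 - The Hulk (2)", "ssi2/fellowship"),
    ("9d681b66 - Captain America (8)", "ssi1/fellowship"),
    ("9d680716 - Iron Man (4)", "ssi2/fellowship"),
    ("9d680248 - Iron Man (1)", "ssi2/fellowship"),
    ("9d6818e6 - Captain America (7)", "ssi1/fellowship"),
]

# Plain-Python analogue of filter(grant="ssi2/fellowship"): exact tag name.
exact = [fund for fund, tag in fund_tags if tag == "ssi2/fellowship"]

# Plain-Python analogue of filter(grant__name__contains="fellowship").
partial = [fund for fund, tag in fund_tags if "fellowship" in tag]

# Number of funds per tag.
per_tag = Counter(tag for _, tag in fund_tags)

print(len(exact), len(partial))  # 8 10
print(per_tag)
```

The two extra funds in the partial match are the ones tagged `ssi1/fellowship`.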
