In this repository, you will find JSON representing a <i>subset</i> of the data for the <a href="http://modeldb.yale.edu">ModelDB</a> repository of computational neuroscience models.

<h1>Getting started</h1>

Begin by cloning this repository. Create a private repository on GitHub and push your local copy to it.<br/><br/>Connect to MongoDB and create a database for this assignment.

In [47]:
from pymongo import MongoClient
client = MongoClient()
client.drop_database('mydb')
mydb = client['mydb']

Using the <tt>json</tt> module and Python file operations, load the data from <tt>modelcollection.json</tt> and <tt>papercollection.json</tt> into Python.

In [48]:
import json

# Keep the file handles and the parsed data under separate names.
with open('modelcollection.json') as model_file:
    models = json.load(model_file)
with open('papercollection.json') as paper_file:
    papers = json.load(paper_file)

Put the loaded data into two collections in your database. I recommend calling them <tt>models</tt> and <tt>papers</tt>.

In [49]:
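modelcollection = mydb.modelcollection
modelcollection.insert_many(models)
# The papers need their own collection so that references can be resolved later.
papercollection = mydb.papercollection
papercollection.insert_many(papers)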

<pymongo.results.InsertManyResult at 0x10920b4c8>

<h1>Explore the database</h1>

Use MongoDB to answer the following questions. Run your code in the spaces provided.

<b>Q: How many models are there?</b>

In [50]:
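# Collection.count() was removed in PyMongo 4; count_documents({}) counts every document.
modelcollection.count_documents({})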

1114

<b>What are the field names (keys) for the model entry with <tt>_id</tt> = 87284?</b>

In [51]:
modelcollection.find_one({'_id': 87284})

{'_id': 87284,
 'brainregions': [],
 'celltypes': ['Hippocampus CA1 pyramidal cell'],
 'channels': ['I Na,t',
  'I L high threshold',
  'I N',
  'I T low threshold',
  'I A',
  'I K',
  'I h'],
 'genes': [],
 'modelconcepts': ['Dendritic Action Potentials',
  'Active Dendrites',
  'Detailed Neuronal Models',
  'Pathophysiology',
  'Aging/Alzheimer`s'],
 'modeltype': ['Neuron or other electrically excitable cell'],
 'receptors': [],
 'references': [126976],
 'simenvironment': ['NEURON'],
 'text': 'The model simulations provide evidence oblique dendrites in CA1 pyramidal neurons are susceptible to hyper-excitability by amyloid beta block of the transient K+ channel, IA.  See paper for details.',
 'title': 'Amyloid beta (IA block) effects on a model CA1 pyramidal cell (Morse et al. 2010)',
 'transmitters': []}
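The question asks only for the field names, so the keys can be pulled out of the returned document directly. The following is a minimal sketch: <tt>field_names</tt> is a hypothetical helper, and <tt>sample</tt> is a stand-in for the result of <tt>find_one</tt> with the same keys as the output above and the values omitted.

```python
def field_names(document):
    """Return the top-level keys of a document in sorted order."""
    return sorted(document)

# Stand-in for modelcollection.find_one({'_id': 87284}); values are omitted,
# only the keys shown in the output above are reproduced.
sample = dict.fromkeys([
    '_id', 'brainregions', 'celltypes', 'channels', 'genes', 'modelconcepts',
    'modeltype', 'receptors', 'references', 'simenvironment', 'text', 'title',
    'transmitters',
])

keys = field_names(sample)
print(keys)
print(len(keys))
```

With a live connection, the same helper can be applied to <tt>modelcollection.find_one({'_id': 87284})</tt>.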

Note: this data is not completely denormalized. References in both collections are given as <tt>_id</tt> values of documents in the paper collection.<br/><br/><b>How many distinct cell types are in the models collection?</b>

In [72]:
len(modelcollection.distinct('celltypes'))

188

<b>Find the list of model ids for models that contain a Hippocampus CA3 pyramidal cell.</b>

In [115]:
for item in modelcollection.find({'celltypes': 'Hippocampus CA3 pyramidal cell'}):
    print(item['_id'])

101629
114337
118098
120907
126814
129067
135902
135903
137259
137505
138421
139421
142104
143148
146499
147756
147867
148035
150288
151282
168314
168874
181967
184139
185512
186768
189088
20007
3263
35358
7907
84606
87216
87762
98003


<b>What other cells appear in models with a Hippocampus CA3 pyramidal cell? Sort them in alphabetical order. How many such cells are there?</b>

In [148]:
for item in (modelcollection.find({'celltypes':'Hippocampus CA3 pyramidal cell'})):
    print (item['celltypes'])

['Hippocampus CA3 pyramidal cell']
['Hippocampus CA3 pyramidal cell']
['Hippocampus CA3 pyramidal cell']
['Hippocampus CA3 pyramidal cell']
['Hippocampus CA3 pyramidal cell']
['Hippocampus CA1 pyramidal cell', 'Hippocampus CA3 pyramidal cell', 'Entorhinal cortex stellate cell']
['Hippocampus CA1 pyramidal cell', 'Hippocampus CA3 pyramidal cell', 'Hippocampus CA1 interneuron oriens alveus', 'Hippocampus CA1 basket cell']
['Hippocampus CA1 pyramidal cell', 'Hippocampus CA3 pyramidal cell', 'Hippocampus CA1 interneuron oriens alveus', 'Hippocampus CA1 basket cell']
['Hippocampus CA3 pyramidal cell']
['Hippocampus CA1 pyramidal cell', 'Hippocampus CA3 pyramidal cell']
['Hippocampus CA1 pyramidal cell', 'Hippocampus CA3 pyramidal cell', 'Hippocampus CA1 interneuron oriens alveus', 'Neocortex fast spiking (FS) interneuron']
['Hippocampus CA3 pyramidal cell', 'Hippocampus CA3 basket cell', 'Hippocampus CA3 stratum oriens lacunosum-moleculare interneuron']
['Hippocampus CA3 pyramidal cell', 'H
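The printed lists still need to be merged, deduplicated and sorted to answer the question. One possible sketch follows: <tt>co_occurring_celltypes</tt> is a hypothetical helper, and <tt>sample_docs</tt> holds only three of the lists printed above, so its result is not the answer for the full collection.

```python
def co_occurring_celltypes(documents, target):
    """Collect every cell type that shares a document with `target`, sorted."""
    others = set()
    for doc in documents:
        celltypes = doc.get('celltypes', [])
        if target in celltypes:
            others.update(c for c in celltypes if c != target)
    return sorted(others)

TARGET = 'Hippocampus CA3 pyramidal cell'

# Three of the lists printed above, used here as a small stand-in data set.
sample_docs = [
    {'celltypes': ['Hippocampus CA3 pyramidal cell']},
    {'celltypes': ['Hippocampus CA1 pyramidal cell',
                   'Hippocampus CA3 pyramidal cell',
                   'Entorhinal cortex stellate cell']},
    {'celltypes': ['Hippocampus CA3 pyramidal cell',
                   'Hippocampus CA3 basket cell',
                   'Hippocampus CA3 stratum oriens lacunosum-moleculare interneuron']},
]

others = co_occurring_celltypes(sample_docs, TARGET)
print(others)
print(len(others))
```

Any iterable of documents works, so the cursor returned by <tt>modelcollection.find({'celltypes': TARGET})</tt> can be passed in place of <tt>sample_docs</tt>.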

<h1>Use aggregation</h1>

How many models are there for each cell type? Display the results in a formatted table, sorted from most commonly appearing cell type to least commonly appearing.

In [152]:
list(modelcollection.aggregate([
            {
               '$unwind': '$celltypes' 
            }, 
            {
                '$group': {
                    '_id': '$celltypes',
                    'count': {'$sum':1}
                }
            },
            {
                '$sort': {'count': -1}
            }
            
        ]))

[{'_id': 'Neocortex layer 5-6 pyramidal cell', 'count': 108},
 {'_id': 'Hippocampus CA1 pyramidal cell', 'count': 104},
 {'_id': 'Neocortex layer 2-3 pyramidal cell', 'count': 60},
 {'_id': 'Hippocampus CA3 pyramidal cell', 'count': 35},
 {'_id': 'Olfactory bulb main mitral cell', 'count': 30},
 {'_id': 'Neocortex fast spiking (FS) interneuron', 'count': 30},
 {'_id': 'Hodgkin-Huxley neuron', 'count': 29},
 {'_id': 'Thalamus geniculate nucleus (lateral) principal neuron',
  'count': 26},
 {'_id': 'Abstract integrate-and-fire leaky neuron', 'count': 25},
 {'_id': 'Dentate gyrus granule cell', 'count': 24},
 {'_id': 'Cerebellum purkinje cell', 'count': 24},
 {'_id': 'Neocortex spiking regular (RS) neuron', 'count': 22},
 {'_id': 'Neostriatum spiny direct pathway neuron', 'count': 22},
 {'_id': 'Neocortex spiking low threshold (LTS) neuron', 'count': 21},
 {'_id': 'Neocortex layer 4 pyramidal cell', 'count': 20},
 {'_id': 'Globus pallidus neuron', 'count': 19},
 {'_id': 'Olfactory bulb ma
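The question asks for a formatted table rather than a raw list. A minimal sketch of one way to lay it out is shown here: <tt>format_count_table</tt> is a hypothetical helper, and <tt>sample_rows</tt> reuses only the first three rows of the output above.

```python
def format_count_table(rows, label='Cell type'):
    """Render aggregation rows ({'_id': ..., 'count': ...}) as aligned text."""
    width = max([len(label)] + [len(str(row['_id'])) for row in rows])
    lines = [f"{label:<{width}}  {'Models':>6}", '-' * (width + 8)]
    for row in rows:
        lines.append(f"{row['_id']:<{width}}  {row['count']:>6}")
    return '\n'.join(lines)

# First three rows of the aggregation output above, already sorted by count.
sample_rows = [
    {'_id': 'Neocortex layer 5-6 pyramidal cell', 'count': 108},
    {'_id': 'Hippocampus CA1 pyramidal cell', 'count': 104},
    {'_id': 'Neocortex layer 2-3 pyramidal cell', 'count': 60},
]

table = format_count_table(sample_rows)
print(table)
```

The helper keeps the order it is given, so the <tt>$sort</tt> stage in the pipeline still determines the ranking.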

Find the model titles (not paper titles) for models that (1) involve a Hippocampus CA3 pyramidal cell, and (2) have an associated reference where one of the authors is "Migliore M".

In [153]:
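# Field names inside $match take no '$' prefix, and models carry no author data,
# so the papers must be joined in through the references array first.
# Assumes the papers were inserted into a collection named 'papercollection'
# and that each paper lists its authors under an 'authors' key.
list(modelcollection.aggregate([
            {'$match': {'celltypes': 'Hippocampus CA3 pyramidal cell'}},
            {
                '$lookup': {
                    'from': 'papercollection',
                    'localField': 'references',
                    'foreignField': '_id',
                    'as': 'papers'
                }
            },
            {'$match': {'papers.authors': 'Migliore M'}},
            {'$project': {'_id': 0, 'title': 1}}
        ]))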

Find all the authors who were on a paper associated with a model that involved a Hippocampus CA3 pyramidal cell. Sort them in alphabetical order; give this list and state its length.
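One way to approach this is to join each matching model to its papers and then group by author. The pipeline below is a sketch under two assumptions that the notebook does not confirm: the papers are stored in a collection named <tt>papercollection</tt>, and each paper keeps its authors as a list under an <tt>authors</tt> key. <tt>ca3_author_pipeline</tt> is a hypothetical helper that only builds the pipeline; running it needs a live MongoDB connection.

```python
CA3 = 'Hippocampus CA3 pyramidal cell'

def ca3_author_pipeline(papers_name='papercollection'):
    """Build an aggregation pipeline listing distinct authors, A to Z."""
    return [
        # Keep only models that involve the CA3 pyramidal cell.
        {'$match': {'celltypes': CA3}},
        # Join papers whose _id appears in the model's references array.
        {'$lookup': {
            'from': papers_name,          # assumed collection name
            'localField': 'references',
            'foreignField': '_id',
            'as': 'papers',
        }},
        # One row per joined paper, then one row per author (assumed field).
        {'$unwind': '$papers'},
        {'$unwind': '$papers.authors'},
        # Grouping on the author name removes duplicates.
        {'$group': {'_id': '$papers.authors'}},
        {'$sort': {'_id': 1}},
    ]

pipeline = ca3_author_pipeline()
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)

# With a running MongoDB and the collections loaded:
# authors = [row['_id'] for row in modelcollection.aggregate(pipeline)]
# print(authors, len(authors))
```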

<h1>Modify the database</h1>

Rename the Hippocampus CA1 pyramidal cell to be the Hippocampus CA1 pyramidal neuron. (Note: here we're using CA1 instead of CA3.) Make sure that this is consistent across all documents in the models collection.

In [144]:
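# The filter must spell the 'celltypes' field correctly, otherwise nothing matches.
# The positional $ then rewrites the matched array element in each document.
modelcollection.update_many({'celltypes': 'Hippocampus CA1 pyramidal cell'},
                  {'$set': {'celltypes.$': 'Hippocampus CA1 pyramidal neuron'}}
)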

<pymongo.results.UpdateResult at 0x109637b88>

Add a new entry (make up the data, but keep it appropriate) to the models collection. Associate it with two references, one that already exists and one that you also add to the papers collection.

In [147]:
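# Insert the made-up paper first so the model can reference its generated _id.
# Assumes the papers live in 'papercollection' and list authors under 'authors'.
new_paper = mydb.papercollection.insert_one({
        'authors': ['Bartos M', 'Johnny McFake author III']
              })
modelcollection.insert_one({
        'celltypes': ['Hippocampus CA1 pyramidal neuron'],
        'channels': ['I Na,t', 'I Na,p', 'I Potassium', 'I A', 'I K', 'I M',
                     'I L high threshold', 'I N', 'I T low threshold', 'I p,q',
                     'I K,Ca', 'I h'],
        'transmitters': ['NO', 'Glutamate'],
        'average dendrite length': 4586,
        # 126976 is already referenced by model 87284; the second id is the new paper.
        'references': [126976, new_paper.inserted_id]
              })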

<pymongo.results.InsertOneResult at 0x109637d38>