# Advanced Querying Mongo

**⚠️ IMPORTANT: Limit what you print to avoid endless scrolling. Otherwise your
code will get lost between the printed lines. When working with lists, do:**

```python
list(collection.find(query))[:5] #or a reasonably low number
```
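
Slicing after `list(...)` still builds the full list before trimming it. A minimal sketch of stopping early instead; `fake_cursor` is a hypothetical stand-in for a PyMongo cursor so the snippet runs without a database.

```python
from itertools import islice


def fake_cursor(n):
    """Hypothetical stand-in for a cursor: lazily yields n small documents."""
    for i in range(n):
        yield {"name": f"company_{i}"}


# Slicing a list materializes all 1000 documents first, then keeps 5.
sliced = list(fake_cursor(1000))[:5]

# islice stops after 5 documents; the remaining 995 are never produced.
first_five = list(islice(fake_cursor(1000), 5))

print(first_five == sliced)
```

With a real cursor, `list(collection.find(query).limit(5))` has the same effect and asks the server for at most five documents.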

Importing libraries and setting up connection

In [7]:
from pymongo import MongoClient
import pandas as pd
import time


client = MongoClient("localhost:27017")
db = client["ironhack"]
c = db.get_collection("companies")

### 1. All the companies whose name matches 'Babelgum'. Retrieve only their `name` field.

In [12]:
# Your Code

projection = {"_id":0, "name": 1}

list(c.find({"name": "Babelgum"}, projection))


[{'name': 'Babelgum'}]

### 2. All the companies that have more than 5000 employees. Limit the search to 20 companies and sort them by **number of employees**.

In [24]:
# Your Code

projection = {"_id":0, "name": 1, "number_of_employees":1}

list(c.find({"number_of_employees": {"$gt":5000}}, projection).sort("number_of_employees", -1).limit(20))


[{'name': 'Siemens', 'number_of_employees': 405000},
 {'name': 'IBM', 'number_of_employees': 388000},
 {'name': 'Toyota', 'number_of_employees': 320000},
 {'name': 'PayPal', 'number_of_employees': 300000},
 {'name': 'Nippon Telegraph and Telephone Corporation',
  'number_of_employees': 227000},
 {'name': 'Samsung Electronics', 'number_of_employees': 221726},
 {'name': 'Accenture', 'number_of_employees': 205000},
 {'name': 'Tata Consultancy Services', 'number_of_employees': 200300},
 {'name': 'Flextronics International', 'number_of_employees': 200000},
 {'name': 'Safeway', 'number_of_employees': 186000},
 {'name': 'Sony', 'number_of_employees': 180500},
 {'name': 'LG', 'number_of_employees': 177000},
 {'name': 'Ford', 'number_of_employees': 171000},
 {'name': 'Boeing', 'number_of_employees': 160000},
 {'name': 'Digital Equipment Corporation', 'number_of_employees': 140000},
 {'name': 'Nokia', 'number_of_employees': 125000},
 {'name': 'MItsubishi Electric', 'number_of_employees': 107000}

### 3. All the companies founded between 2000 and 2005, both years included. Retrieve only the `name` and `founded_year` fields.

In [31]:
# Your Code

filter1= {"founded_year":{"$gte":2000}}
filter2= {"founded_year":{"$lte":2005}}

projection = {"_id":0, "name": 1, "founded_year":1}

multiple_conditions = {"$and":[filter1, filter2]}

list(c.find(multiple_conditions, projection))


[{'name': 'Wetpaint', 'founded_year': 2005},
 {'name': 'Zoho', 'founded_year': 2005},
 {'name': 'Digg', 'founded_year': 2004},
 {'name': 'Facebook', 'founded_year': 2004},
 {'name': 'Omnidrive', 'founded_year': 2005},
 {'name': 'StumbleUpon', 'founded_year': 2002},
 {'name': 'Gizmoz', 'founded_year': 2003},
 {'name': 'Helio', 'founded_year': 2005},
 {'name': 'Plaxo', 'founded_year': 2002},
 {'name': 'Technorati', 'founded_year': 2002},
 {'name': 'AddThis', 'founded_year': 2004},
 {'name': 'Veoh', 'founded_year': 2004},
 {'name': 'Jingle Networks', 'founded_year': 2005},
 {'name': 'Meetup', 'founded_year': 2002},
 {'name': 'LifeLock', 'founded_year': 2005},
 {'name': 'Wesabe', 'founded_year': 2005},
 {'name': 'Jangl SMS', 'founded_year': 2005},
 {'name': 'SmugMug', 'founded_year': 2002},
 {'name': 'Jajah', 'founded_year': 2005},
 {'name': 'Skype', 'founded_year': 2003},
 {'name': 'YouTube', 'founded_year': 2005},
 {'name': 'Pando Networks', 'founded_year': 2004},
 {'name': 'Ikan', 'foun
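
Both bounds apply to the same field, so the `$and` wrapper is optional: several operators can share one sub-document. A sketch of that shorter form follows; `in_range` is a hypothetical helper that mimics the filter on plain dictionaries so it can be checked without a database.

```python
# Single-field range: both operators live in the same sub-document.
range_filter = {"founded_year": {"$gte": 2000, "$lte": 2005}}
projection = {"_id": 0, "name": 1, "founded_year": 1}


def in_range(doc, query=range_filter):
    """Hypothetical local check mirroring $gte/$lte on founded_year."""
    bounds = query["founded_year"]
    year = doc.get("founded_year")
    return year is not None and bounds["$gte"] <= year <= bounds["$lte"]


# Small in-memory sample using values that appear in this notebook's outputs.
sample = [
    {"name": "Facebook", "founded_year": 2004},
    {"name": "Siemens", "founded_year": 1847},
    {"name": "SpinVox", "founded_year": None},
]
matches = [d["name"] for d in sample if in_range(d)]

# Against the real collection (assumes the `c` handle defined earlier):
# list(c.find(range_filter, projection).limit(5))
```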

### 4. All the companies that had a Valuation Amount of more than 100.000.000 and have been founded before 2010. Retrieve only the `name` and `ipo` fields.

In [51]:
# Your Code

# Assumption: the valuation is stored in the nested `ipo.valuation_amount` field
filter1 = {"ipo.valuation_amount":{"$gt":100000000}}

filter2 = {"founded_year":{"$lt":2010}}

projection = {"_id":0, "name": 1, "ipo":1}

multiple_conditions = {"$and":[filter1,filter2]}

list(c.find(multiple_conditions, projection).limit(5))

### 5. All the companies that have less than 1000 employees and have been founded before 2005. Order them by the number of employees and limit the search to 10 companies.

In [55]:
# Your Code

filter1 = {"number_of_employees":{"$lt":1000}}

filter2 = {"founded_year":{"$lt":2005}}

projection = {"_id":0, "name": 1}

multiple_conditions = {"$and":[filter1,filter2]}

list(c.find(multiple_conditions, projection).sort("number_of_employees", -1).limit(10))


[{'name': 'Infinera Corporation'},
 {'name': 'NorthPoint Communications Group'},
 {'name': '888 Holdings'},
 {'name': 'Forrester Research'},
 {'name': 'SonicWALL'},
 {'name': 'Webmetrics'},
 {'name': 'Cornerstone OnDemand'},
 {'name': 'Mozilla'},
 {'name': 'Buongiorno'},
 {'name': 'Yelp'}]

### 6. All the companies that don't include the `partners` field.

In [58]:
# Your Code

# $exists: False matches documents where the field is absent;
# $ne only compares the field's value against the given value.
not_partners = {"partners": {"$exists": False}}

list(c.find(not_partners).limit(1))

### 7. All the companies that have a null type of value on the `category_code` field.

In [67]:
# Your Code

not_category = {"category_code": {"$type": "null"}}

projection = {"_id":0, "name": 1, "category_code":1}

list(c.find(not_category,projection).limit(5))

[{'name': 'Collective', 'category_code': None},
 {'name': 'Snimmer', 'category_code': None},
 {'name': 'KoolIM', 'category_code': None},
 {'name': 'Level9 Media', 'category_code': None},
 {'name': 'VidKing', 'category_code': None}]
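
`$type: "null"` matches only documents where the field is present and holds null. The shorter filter `{"category_code": None}` also matches documents where the field is missing entirely. A sketch of the difference on made-up documents, with hypothetical helper functions standing in for the server-side matching:

```python
only_null = {"category_code": {"$type": "null"}}  # field present, value null
null_or_missing = {"category_code": None}         # value null OR field absent


def matches_only_null(doc):
    """Hypothetical local equivalent of the $type: "null" filter."""
    return "category_code" in doc and doc["category_code"] is None


def matches_null_or_missing(doc):
    """Hypothetical local equivalent of the {"category_code": None} filter."""
    return doc.get("category_code") is None


# Made-up documents, for illustration only.
sample = [
    {"name": "a", "category_code": None},
    {"name": "b"},
    {"name": "c", "category_code": "web"},
]
strict = [d["name"] for d in sample if matches_only_null(d)]
loose = [d["name"] for d in sample if matches_null_or_missing(d)]
```

Here `strict` keeps only the document with an explicit null, while `loose` also keeps the one without the field.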

### 8. All the companies that have at least 100 employees but less than 1000. Retrieve only the `name` and `number of employees` fields.

In [69]:
# Your Code

filter1 =  {"number_of_employees":{"$gte":100}}

filter2 =  {"number_of_employees":{"$lt":1000}}

multiple_conditions = {"$and":[filter1,filter2]}

projection = {"_id":0, "name": 1, "number_of_employees":1}


list(c.find(multiple_conditions,projection).limit(20))


[{'name': 'AdventNet', 'number_of_employees': 600},
 {'name': 'AddThis', 'number_of_employees': 120},
 {'name': 'OpenX', 'number_of_employees': 305},
 {'name': 'LifeLock', 'number_of_employees': 644},
 {'name': 'Jajah', 'number_of_employees': 110},
 {'name': 'Livestream', 'number_of_employees': 120},
 {'name': 'Ustream', 'number_of_employees': 250},
 {'name': 'iContact', 'number_of_employees': 300},
 {'name': 'Yelp', 'number_of_employees': 800},
 {'name': 'Dailymotion', 'number_of_employees': 120},
 {'name': 'RockYou', 'number_of_employees': 106},
 {'name': 'Meebo', 'number_of_employees': 200},
 {'name': 'Eventbrite', 'number_of_employees': 200},
 {'name': 'Box', 'number_of_employees': 950},
 {'name': 'Conduit', 'number_of_employees': 215},
 {'name': 'Redfin', 'number_of_employees': 100},
 {'name': 'oDesk', 'number_of_employees': 120},
 {'name': 'Simply Hired', 'number_of_employees': 100},
 {'name': 'PhotoBox', 'number_of_employees': 600},
 {'name': 'Spreadshirt', 'number_of_employees'

### 9. Order all the companies by their IPO price in a descending order.

In [79]:
# Your Code

# Assumption: the IPO price is stored in the nested `ipo.valuation_amount` field
projection = {"_id":0, "name": 1, "ipo":1}

list(c.find({}, projection).sort("ipo.valuation_amount", -1).limit(5))

### 10. Retrieve the 10 companies with the most employees, ordered by the `number of employees`.

In [83]:
# Your Code

list(c.find().sort("number_of_employees", -1).limit(10))


[{'_id': ObjectId('52cdef7d4bab8bd67529941a'),
  'name': 'Siemens',
  'permalink': 'siemens',
  'crunchbase_url': 'http://www.crunchbase.com/company/siemens',
  'homepage_url': 'http://www.siemens.com',
  'blog_url': '',
  'blog_feed_url': '',
  'twitter_username': 'Siemens',
  'category_code': 'hardware',
  'number_of_employees': 405000,
  'founded_year': 1847,
  'founded_month': None,
  'founded_day': None,
  'deadpooled_year': None,
  'deadpooled_month': None,
  'deadpooled_day': None,
  'deadpooled_url': None,
  'tag_list': 'automation, building-technologies, drive-technology, energy',
  'alias_list': '',
  'email_address': 'contact@siemens.com',
  'phone_number': '49 89 636 34134',
  'description': 'Electronics and Electrical Engineering',
  'created_at': 'Thu Jul 31 09:29:43 UTC 2008',
  'updated_at': 'Thu Nov 28 20:32:55 UTC 2013',
  'overview': '<p>Siemens AG, an electronics and electrical engineering company, operates in the industry, energy, and healthcare sectors worldwide. 

### 11. All the companies founded in the second semester of the year. Limit your search to 1000 companies.

In [86]:
# Your Code

# Second semester = July (7) through December (12)
filter1 = {"founded_month":{"$gte":7}}

projection = {"_id":0, "name": 1}


list(c.find(filter1,projection).limit(1000))[:5]  # show only the first five

### 12. All the companies founded before 2000 that have an acquisition amount of more than 10.000.000

In [93]:
# Your Code

filter1= {"founded_year":{"$lt":2000}}

filter2= {"acquisition.price_amount":{"$gt":10000000}}

multiple_conditions = {"$and":[filter1,filter2]}

list(c.find(multiple_conditions).limit(10))

[{'_id': ObjectId('52cdef7c4bab8bd675297d90'),
  'name': 'Postini',
  'permalink': 'postini',
  'crunchbase_url': 'http://www.crunchbase.com/company/postini',
  'homepage_url': 'http://postini.com',
  'blog_url': '',
  'blog_feed_url': '',
  'twitter_username': None,
  'category_code': 'web',
  'number_of_employees': None,
  'founded_year': 1999,
  'founded_month': 6,
  'founded_day': 2,
  'deadpooled_year': None,
  'deadpooled_month': None,
  'deadpooled_day': None,
  'deadpooled_url': None,
  'tag_list': '',
  'alias_list': None,
  'email_address': '',
  'phone_number': '888.584.3150',
  'description': None,
  'created_at': 'Fri Jun 08 12:19:51 UTC 2007',
  'updated_at': 'Sat Aug 13 18:02:34 UTC 2011',
  'overview': '<p>Postini focuses on two main issues: security and compliance. Postini states that it handles more than 1 billion messages everyday and protects more than 35,000 businesses worldwide.</p>\n\n<p>Postini offers solutions that protect your company from malicious internet a

### 13. All the companies that have been acquired after 2010, order by the acquisition amount, and retrieve only their `name` and `acquisition` field.

In [96]:
# Your Code

filter1= {"acquisition.acquired_year":{"$gt":2010}}

projection = {"_id":0, "name": 1, "acquisition":1}

list(c.find(filter1, projection).sort("acquisition.price_amount", -1).limit(2))


[{'name': 'T-Mobile',
  'acquisition': {'price_amount': 39000000000,
   'price_currency_code': 'USD',
   'term_code': None,
   'source_url': 'http://techcrunch.com/2011/03/20/in-the-race-for-more-spectrum-att-is-acquiring-t-mobile-for-39-billion/',
   'source_description': 'In The Race For More Spectrum, AT&T Is Acquiring T-Mobile For $39 Billion',
   'acquired_year': 2011,
   'acquired_month': 3,
   'acquired_day': 20,
   'acquiring_company': {'name': 'AT&T', 'permalink': 'at-t'}}},
 {'name': 'Goodrich Corporation',
  'acquisition': {'price_amount': 18400000000,
   'price_currency_code': 'USD',
   'term_code': None,
   'source_url': 'http://www.masshightech.com/stories/2011/09/19/daily37-UTC-shells-out-184-billion-for-Goodrich.html',
   'source_description': 'UTC shells out $18.4 billion for Goodrich',
   'acquired_year': 2011,
   'acquired_month': 9,
   'acquired_day': 22,
   'acquiring_company': {'name': 'United Technologies',
    'permalink': 'united-technologies'}}}]

### 14. Order the companies by their `founded year`, retrieving only their `name` and `founded year`.

In [100]:
# Your Code

projection = {"_id":0, "name": 1, "founded_year":1}

list(c.find({},projection).sort("founded_year", 1).limit(5))

[{'name': 'SpinVox', 'founded_year': None},
 {'name': 'Flektor', 'founded_year': None},
 {'name': 'Info', 'founded_year': None},
 {'name': 'Gannett', 'founded_year': None},
 {'name': 'Lala', 'founded_year': None}]
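
The ascending sort lists documents with a null `founded_year` first, because MongoDB orders null before numbers. One way to skip them, sketched below, is to exclude nulls in the filter. `sort_like_mongo` is a hypothetical local emulation of that ordering, used only so the snippet runs without a database.

```python
known_year = {"founded_year": {"$ne": None}}  # drops null and missing years
projection = {"_id": 0, "name": 1, "founded_year": 1}


def sort_like_mongo(docs):
    """Hypothetical emulation: nulls first, then ascending numbers."""
    return sorted(
        docs,
        key=lambda d: (d["founded_year"] is not None, d["founded_year"] or 0),
    )


# Small sample using values that appear in this notebook's outputs.
sample = [
    {"name": "Digg", "founded_year": 2004},
    {"name": "SpinVox", "founded_year": None},
    {"name": "Siemens", "founded_year": 1847},
]
with_nulls = [d["name"] for d in sort_like_mongo(sample)]
without_nulls = [
    d["name"] for d in sort_like_mongo(sample) if d["founded_year"] is not None
]

# Against the real collection (assumes the `c` handle defined earlier):
# list(c.find(known_year, projection).sort("founded_year", 1).limit(5))
```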

### 15. All the companies that have been founded in the first seven days of the month, including the seventh. Sort them by their `acquisition price` in descending order. Limit the search to 10 documents.

In [104]:
# Your Code

filter1= {"founded_day":{"$lte":7}}

projection = {"_id":0, "name": 1}

list(c.find(filter1, projection).sort("acquisition.price_amount", -1).limit(10))


[{'name': 'Netscape'},
 {'name': 'PayPal'},
 {'name': 'Zappos'},
 {'name': 'Alibaba'},
 {'name': 'Postini'},
 {'name': 'Danger'},
 {'name': 'Clearwell Systems'},
 {'name': 'PrimeSense'},
 {'name': 'Amobee'},
 {'name': 'BlueLithium'}]

### 16. All the companies in the 'web' `category` that have more than 4000 employees. Sort them by the number of employees in ascending order.

In [105]:
# Your Code

# The question also restricts the search to the 'web' category
filter1 = {"category_code": "web"}

filter2 = {"number_of_employees": {"$gt":4000}}

projection = {"_id":0, "name": 1}

multiple_conditions = {"$and":[filter1,filter2]}

list(c.find(multiple_conditions, projection).sort("number_of_employees", 1).limit(10))

### 17. All the companies whose acquisition amount is more than 10.000.000, and currency is 'EUR'.

In [112]:
# Your Code

filter1= {"acquisition.price_amount":{"$gt":10000000}}

# The currency code is nested inside the acquisition sub-document
filter2= {"acquisition.price_currency_code":"EUR"}

projection = {"_id":0, "name": 1}

multiple_conditions = {"$and":[filter1,filter2]}

list(c.find(multiple_conditions, projection).limit(10))

### 18. All the companies that have been acquired in the first quarter of the year. Limit the search to 10 companies, and retrieve only their `name` and `acquisition` fields.

In [117]:
# Your Code

# acquired_month is nested inside the acquisition sub-document
filter1 = {"acquisition.acquired_month":{"$lte":3}}

projection = {"_id":0, "name": 1, "acquisition":1}

list(c.find(filter1, projection).limit(10))

# Bonus
### 19. All the companies that have been founded between 2000 and 2010, but have not been acquired before 2011.

In [None]:
# Your Code
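
One possible reading, offered as a sketch rather than the intended answer: "not acquired before 2011" covers both companies that were never acquired and companies acquired in 2011 or later. `$not` around `$lt` expresses that, since it also matches documents where the nested field is null or absent. `matches_q19` is a hypothetical local check, and the sample documents are made up.

```python
q19 = {
    "founded_year": {"$gte": 2000, "$lte": 2010},
    # Matches when acquired_year is missing, null, or >= 2011.
    "acquisition.acquired_year": {"$not": {"$lt": 2011}},
}


def matches_q19(doc):
    """Hypothetical local check mirroring the q19 filter."""
    year = doc.get("founded_year")
    if year is None or not 2000 <= year <= 2010:
        return False
    acquisition = doc.get("acquisition") or {}
    acquired = acquisition.get("acquired_year")
    return acquired is None or acquired >= 2011


# Made-up documents, for illustration only.
sample = [
    {"name": "never_acquired", "founded_year": 2003, "acquisition": None},
    {"name": "acquired_late", "founded_year": 2006,
     "acquisition": {"acquired_year": 2012}},
    {"name": "acquired_early", "founded_year": 2004,
     "acquisition": {"acquired_year": 2008}},
    {"name": "too_old", "founded_year": 1999, "acquisition": None},
]
kept = [d["name"] for d in sample if matches_q19(d)]

# Against the real collection (assumes the `c` handle defined earlier):
# list(c.find(q19, {"_id": 0, "name": 1}).limit(5))
```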

### 20. All the companies that have been 'deadpooled' after the third year.

In [None]:
# Your Code
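
The wording is ambiguous, so the sketch below shows two possible readings rather than a definitive answer. Reading (a) compares the stored `deadpooled_year` value with 3 directly. Reading (b) assumes the question means more than three years between founding and deadpooling, and uses `$expr` (available in MongoDB 3.6 and later) to compare two fields of the same document. `elapsed_gt_three` is a hypothetical local check for reading (b), and the sample documents are made up.

```python
# Reading (a): compare the field's stored value directly.
q20_literal = {"deadpooled_year": {"$gt": 3}}

# Reading (b): more than three years between founding and deadpooling.
q20_elapsed = {
    "deadpooled_year": {"$ne": None},
    "founded_year": {"$ne": None},
    "$expr": {
        "$gt": [{"$subtract": ["$deadpooled_year", "$founded_year"]}, 3]
    },
}


def elapsed_gt_three(doc):
    """Hypothetical local check mirroring reading (b)."""
    founded = doc.get("founded_year")
    dead = doc.get("deadpooled_year")
    if founded is None or dead is None:
        return False
    return dead - founded > 3


# Made-up documents, for illustration only.
sample = [
    {"name": "x", "founded_year": 2005, "deadpooled_year": 2010},
    {"name": "y", "founded_year": 2005, "deadpooled_year": 2007},
    {"name": "z", "founded_year": 2005, "deadpooled_year": None},
]
kept = [d["name"] for d in sample if elapsed_gt_three(d)]

# Against the real collection (assumes the `c` handle defined earlier):
# list(c.find(q20_elapsed, {"_id": 0, "name": 1}).limit(5))
```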

**⚠️ Did you do this?**

```python
list(collection.find(query))[:5] #or a reasonably low number
```