# Clickstream Project

Your client `Kirana Store` is an E-commerce company. The company wants to focus on targeting the right customers  with the right products to increase overall revenue and conversion rate.

`Kirana Store` has provided you with the clickstream data on their website and wants you to tell them the answer to their queries. This will help them improve their understanding about their customers so that they can create better product personalization, marketing campaigns, advertisements, etc.

The data contains the following fields :-

- `webClientID` - Unique ID of browser for every system. (If a visitor is using multiple browsers on a system like Chrome, Safari, then there would be a different web clientid for each browser). 

- `VisitDateTime` - Date and time of visit.

- `ProductID` - Unique ID of product browsed/ clicked by the visitor.

- `Activity` - Type of activity can be browsing (`pageload`) or clicking (`click`) a product

- `device` - Information about the device used by visitor to visit the website
> - `Browser` - Browser used by visitor
> - `OS` - OS used by the visitor

- `user` - Information about registered user or users who have already signed up
> - `UserID` - Unique ID of the user
> - `City` - City of the user
> - `Country` - Country of the user

----
## Connecting to MongoDB

----

In [1]:
# Importing the required libraries
import pymongo
import pandas as pd
from datetime import datetime
import pprint as pp

# Does not allow pprint to sort the fields
pp.sorted = lambda x, key=None: x

In [2]:
# Connect to local MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017/")

In [3]:
# # Restore database
# !mongorestore /home/avadmin/Desktop/Mongo/Content/Project/Project_data/project

In [4]:
# Choose the database
db = client['project']

In [5]:
# Sample document
pp.pprint(
    db.clicks.find_one()
)

{'_id': ObjectId('60df1029ad74d9467c91a932'),
 'webClientID': 'WI100000244987',
 'VisitDateTime': datetime.datetime(2018, 5, 25, 4, 51, 14, 179000),
 'ProductID': 'Pr100037',
 'Activity': 'click',
 'device': {'Browser': 'Firefox', 'OS': 'Windows'},
 'user': {'City': 'Colombo', 'Country': 'Sri Lanka'}}


---
## Overiew of the data

----

Number of documents in the collection

In [6]:
db.clicks.count_documents({})

6100000

---
Minimum date for which the records are present

In [7]:
from datetime import datetime

In [8]:
cur = db.clicks.find({}, {'VisitDateTime': 1,"_id":0})\
               .sort([('VisitDateTime', pymongo.ASCENDING)])\
               .limit(1)

for doc in cur:
    pp.pprint(doc)

{'VisitDateTime': datetime.datetime(2018, 5, 7, 0, 0, 1, 190000)}


----
Maximum date  for which the records are present

In [9]:
cur = db.clicks.find({}, {'VisitDateTime': 1, '_id': 0})\
               .sort([('VisitDateTime', pymongo.DESCENDING)])\
               .limit(1)

for doc in cur:
    pp.pprint(doc)

{'VisitDateTime': datetime.datetime(2018, 5, 27, 23, 59, 59, 576000)}


----
Number of documents that have `user.UserID` field or number of users who have signed up on the website.

In [10]:
db.clicks.count_documents({'user.UserID': {'$exists': True}})

602293

---
Unique signed up users in the complete data

In [11]:
len(db.clicks.find({'user.UserID': {'$exists': True}}).distinct('user.UserID'))

34050

---
Unique countries and cities

In [12]:
cur = db.clicks.aggregate([
            {
                '$match': {
                            'user.Country': {'$exists':True},
                            'user.City': {'$exists': True}
                        }
            },
            {
                '$group': {
                                '_id': {
                                            'Country': '$user.Country', 
                                            'City': '$user.City'
                                        }
                        }
            },
            {
                '$sort': {'_id.Country': 1}
            }
        ])

for doc in cur:
    pp.pprint(doc)

{'_id': {'Country': 'Afghanistan', 'City': 'Kabul'}}
{'_id': {'Country': 'Albania', 'City': 'Elbasan'}}
{'_id': {'Country': 'Albania', 'City': 'Fier'}}
{'_id': {'Country': 'Albania', 'City': 'Shkoder'}}
{'_id': {'Country': 'Albania', 'City': 'Yzberish'}}
{'_id': {'Country': 'Albania', 'City': 'Lezhë'}}
{'_id': {'Country': 'Albania', 'City': 'Korçë'}}
{'_id': {'Country': 'Albania', 'City': 'Kosove'}}
{'_id': {'Country': 'Albania', 'City': 'Durrës'}}
{'_id': {'Country': 'Albania', 'City': 'Tirana'}}
{'_id': {'Country': 'Algeria', 'City': "'Ain Benian"}}
{'_id': {'Country': 'Algeria', 'City': 'Theniet el Had'}}
{'_id': {'Country': 'Algeria', 'City': 'Frenda'}}
{'_id': {'Country': 'Algeria', 'City': 'Mila'}}
{'_id': {'Country': 'Algeria', 'City': 'Skikda'}}
{'_id': {'Country': 'Algeria', 'City': 'El Hadjira'}}
{'_id': {'Country': 'Algeria', 'City': 'Sig'}}
{'_id': {'Country': 'Algeria', 'City': 'Oum el Bouaghi'}}
{'_id': {'Country': 'Algeria', 'City': 'Tidjelabine'}}
{'_id': {'Country': 'A

{'_id': {'Country': 'Australia', 'City': 'Macquarie Fields'}}
{'_id': {'Country': 'Australia', 'City': 'Carlingford'}}
{'_id': {'Country': 'Australia', 'City': 'Tanunda'}}
{'_id': {'Country': 'Australia', 'City': 'Oakleigh'}}
{'_id': {'Country': 'Australia', 'City': 'Cranebrook'}}
{'_id': {'Country': 'Australia', 'City': 'Gulgong'}}
{'_id': {'Country': 'Australia', 'City': 'Moorooka'}}
{'_id': {'Country': 'Australia', 'City': 'Saint Lucia'}}
{'_id': {'Country': 'Australia', 'City': 'Geraldton'}}
{'_id': {'Country': 'Australia', 'City': 'Armidale'}}
{'_id': {'Country': 'Australia', 'City': 'Auburn'}}
{'_id': {'Country': 'Australia', 'City': 'Middle Swan'}}
{'_id': {'Country': 'Australia', 'City': 'Engadine'}}
{'_id': {'Country': 'Australia', 'City': 'Lawnton'}}
{'_id': {'Country': 'Australia', 'City': 'Caulfield'}}
{'_id': {'Country': 'Australia', 'City': 'Cammeray'}}
{'_id': {'Country': 'Australia', 'City': 'Upper Coomera'}}
{'_id': {'Country': 'Australia', 'City': 'St Kilda'}}
{'_id':

{'_id': {'Country': 'Brazil', 'City': 'Ibiapina'}}
{'_id': {'Country': 'Brazil', 'City': 'Pinhalzinho'}}
{'_id': {'Country': 'Brazil', 'City': 'Itu'}}
{'_id': {'Country': 'Brazil', 'City': 'Nova Timboteua'}}
{'_id': {'Country': 'Brazil', 'City': 'Piquet Carneiro'}}
{'_id': {'Country': 'Brazil', 'City': 'Jundiaí'}}
{'_id': {'Country': 'Brazil', 'City': 'Andradina'}}
{'_id': {'Country': 'Brazil', 'City': 'Sirinhaem'}}
{'_id': {'Country': 'Brazil', 'City': 'Itapecerica da Serra'}}
{'_id': {'Country': 'Brazil', 'City': 'Almenara'}}
{'_id': {'Country': 'Brazil', 'City': 'Bom Jesus dos Perdoes'}}
{'_id': {'Country': 'Brazil', 'City': 'Itabuna'}}
{'_id': {'Country': 'Brazil', 'City': 'Regente Feijo'}}
{'_id': {'Country': 'Brazil', 'City': 'Cruzeiro'}}
{'_id': {'Country': 'Brazil', 'City': 'Brotas'}}
{'_id': {'Country': 'Brazil', 'City': 'Marabá'}}
{'_id': {'Country': 'Brazil', 'City': 'Vila Velha'}}
{'_id': {'Country': 'Brazil', 'City': 'Diadema'}}
{'_id': {'Country': 'Brazil', 'City': 'Nova 

{'_id': {'Country': 'Canada', 'City': 'Paris'}}
{'_id': {'Country': 'Canada', 'City': 'Pakenham'}}
{'_id': {'Country': 'Canada', 'City': 'Victoriaville'}}
{'_id': {'Country': 'Canada', 'City': 'Minden'}}
{'_id': {'Country': 'Canada', 'City': 'Pubnico'}}
{'_id': {'Country': 'Canada', 'City': 'Lethbridge'}}
{'_id': {'Country': 'Canada', 'City': 'Dauphin'}}
{'_id': {'Country': 'Canada', 'City': 'Stettler'}}
{'_id': {'Country': 'Canada', 'City': 'Saskatoon'}}
{'_id': {'Country': 'Canada', 'City': 'Saint-Georges'}}
{'_id': {'Country': 'Canada', 'City': 'Happy Valley-Goose Bay'}}
{'_id': {'Country': 'Canada', 'City': 'Neguac'}}
{'_id': {'Country': 'Canada', 'City': 'Saint Romuald'}}
{'_id': {'Country': 'Canada', 'City': 'Kanata'}}
{'_id': {'Country': 'Canada', 'City': 'Saint-Constant'}}
{'_id': {'Country': 'Canada', 'City': 'St. Catharines'}}
{'_id': {'Country': 'Canada', 'City': 'Flesherton'}}
{'_id': {'Country': 'Canada', 'City': 'Prince George'}}
{'_id': {'Country': 'Canada', 'City': 'Bea

{'_id': {'Country': 'Finland', 'City': 'Isokyrö'}}
{'_id': {'Country': 'Finland', 'City': 'Karvala'}}
{'_id': {'Country': 'Finland', 'City': 'Saekylae'}}
{'_id': {'Country': 'Finland', 'City': 'Jyväskylä'}}
{'_id': {'Country': 'Finland', 'City': 'Nilsiä'}}
{'_id': {'Country': 'Finland', 'City': 'Kirkkonummi'}}
{'_id': {'Country': 'Finland', 'City': 'Järvenpää'}}
{'_id': {'Country': 'Finland', 'City': 'Joensuu'}}
{'_id': {'Country': 'Finland', 'City': 'Rusko'}}
{'_id': {'Country': 'Finland', 'City': 'Ähtäri'}}
{'_id': {'Country': 'Finland', 'City': 'Tampere'}}
{'_id': {'Country': 'Finland', 'City': 'Alavus'}}
{'_id': {'Country': 'Finland', 'City': 'Riihimäki'}}
{'_id': {'Country': 'Finland', 'City': 'Kuhmo'}}
{'_id': {'Country': 'Finland', 'City': 'Alaveteli'}}
{'_id': {'Country': 'Finland', 'City': 'Pernå'}}
{'_id': {'Country': 'Finland', 'City': 'Kangasala'}}
{'_id': {'Country': 'Finland', 'City': 'Mäntsälä'}}
{'_id': {'Country': 'Finland', 'City': 'Laukkoski'}}
{'_id': {'Country': 'F

{'_id': {'Country': 'France', 'City': 'Bas-en-Basset'}}
{'_id': {'Country': 'France', 'City': 'Laon'}}
{'_id': {'Country': 'France', 'City': 'Saint-Amand-Montrond'}}
{'_id': {'Country': 'France', 'City': 'Valreas'}}
{'_id': {'Country': 'France', 'City': 'Prunoy'}}
{'_id': {'Country': 'France', 'City': 'Pontoise'}}
{'_id': {'Country': 'France', 'City': 'Pertuis'}}
{'_id': {'Country': 'France', 'City': 'Petite-Foret'}}
{'_id': {'Country': 'France', 'City': 'Precy-sur-Oise'}}
{'_id': {'Country': 'France', 'City': 'Combloux'}}
{'_id': {'Country': 'France', 'City': 'Montceau-les-Mines'}}
{'_id': {'Country': 'France', 'City': 'Schirmeck'}}
{'_id': {'Country': 'France', 'City': 'Épinay-sur-Orge'}}
{'_id': {'Country': 'France', 'City': 'Saint-Marc-sur-Couesnon'}}
{'_id': {'Country': 'France', 'City': 'Neuilly-sur-Marne'}}
{'_id': {'Country': 'France', 'City': 'Crozon'}}
{'_id': {'Country': 'France', 'City': 'Champigny-sur-Marne'}}
{'_id': {'Country': 'France', 'City': 'Aurillac'}}
{'_id': {'Co

{'_id': {'Country': 'Germany', 'City': 'Uetersen'}}
{'_id': {'Country': 'Germany', 'City': 'Wasserburg'}}
{'_id': {'Country': 'Germany', 'City': 'Iphofen'}}
{'_id': {'Country': 'Germany', 'City': 'Ehingen'}}
{'_id': {'Country': 'Germany', 'City': 'Sussen'}}
{'_id': {'Country': 'Germany', 'City': 'Putzbrunn'}}
{'_id': {'Country': 'Germany', 'City': 'Linow'}}
{'_id': {'Country': 'Germany', 'City': 'Wilhelmsfeld'}}
{'_id': {'Country': 'Germany', 'City': 'Achtrup'}}
{'_id': {'Country': 'Germany', 'City': 'Meine'}}
{'_id': {'Country': 'Germany', 'City': 'Lauchhammer'}}
{'_id': {'Country': 'Germany', 'City': 'Bruchkoebel'}}
{'_id': {'Country': 'Germany', 'City': 'Aue'}}
{'_id': {'Country': 'Germany', 'City': 'Gipperath'}}
{'_id': {'Country': 'Germany', 'City': 'Agathenburg'}}
{'_id': {'Country': 'Germany', 'City': 'Moerfelden-Walldorf'}}
{'_id': {'Country': 'Germany', 'City': 'Kamenz'}}
{'_id': {'Country': 'Germany', 'City': 'Alzenau in Unterfranken'}}
{'_id': {'Country': 'Germany', 'City': 

{'_id': {'Country': 'India', 'City': 'Nagercoil'}}
{'_id': {'Country': 'India', 'City': 'Jabalpur'}}
{'_id': {'Country': 'India', 'City': 'Shimla'}}
{'_id': {'Country': 'India', 'City': 'Dharavi'}}
{'_id': {'Country': 'India', 'City': 'Nellore'}}
{'_id': {'Country': 'India', 'City': 'Amritsar'}}
{'_id': {'Country': 'India', 'City': 'Raichur'}}
{'_id': {'Country': 'India', 'City': 'Gaganpahad'}}
{'_id': {'Country': 'India', 'City': 'Sawai Madhopur'}}
{'_id': {'Country': 'India', 'City': 'Gopalnagar'}}
{'_id': {'Country': 'India', 'City': 'Vengurla'}}
{'_id': {'Country': 'India', 'City': 'Surendranagar'}}
{'_id': {'Country': 'India', 'City': 'Silvassa'}}
{'_id': {'Country': 'India', 'City': 'Bengaluru'}}
{'_id': {'Country': 'India', 'City': 'Jammu'}}
{'_id': {'Country': 'India', 'City': 'Nepanagar'}}
{'_id': {'Country': 'India', 'City': 'Caranzalem'}}
{'_id': {'Country': 'India', 'City': 'Arabaka'}}
{'_id': {'Country': 'India', 'City': 'Budhlada'}}
{'_id': {'Country': 'India', 'City': 'B

{'_id': {'Country': 'Israel', 'City': 'Sakhnin'}}
{'_id': {'Country': 'Israel', 'City': 'Yaziz'}}
{'_id': {'Country': 'Israel', 'City': 'Bat Shelomo'}}
{'_id': {'Country': 'Israel', 'City': 'Netivot'}}
{'_id': {'Country': 'Israel', 'City': "Yavne'el"}}
{'_id': {'Country': 'Israel', 'City': 'Segev Shalom'}}
{'_id': {'Country': 'Israel', 'City': 'Kiryat Ono'}}
{'_id': {'Country': 'Israel', 'City': 'Tel Aviv'}}
{'_id': {'Country': 'Israel', 'City': 'Jerusalem'}}
{'_id': {'Country': 'Israel', 'City': 'Ofakim'}}
{'_id': {'Country': 'Israel', 'City': 'Yated'}}
{'_id': {'Country': 'Israel', 'City': 'Migdal'}}
{'_id': {'Country': 'Israel', 'City': 'Modi‘in Makkabbim Re‘ut'}}
{'_id': {'Country': 'Israel', 'City': 'Shefayim'}}
{'_id': {'Country': 'Israel', 'City': 'Sarid'}}
{'_id': {'Country': 'Israel', 'City': 'Barqay'}}
{'_id': {'Country': 'Israel', 'City': 'Tamra'}}
{'_id': {'Country': 'Israel', 'City': 'Nazerat `Illit'}}
{'_id': {'Country': 'Israel', 'City': 'Netanya'}}
{'_id': {'Country': '

{'_id': {'Country': 'Italy', 'City': 'Civitanova Marche'}}
{'_id': {'Country': 'Italy', 'City': 'Pianoro'}}
{'_id': {'Country': 'Italy', 'City': 'Monfalcone'}}
{'_id': {'Country': 'Italy', 'City': 'Corridonia'}}
{'_id': {'Country': 'Italy', 'City': 'San Pancrazio Salentino'}}
{'_id': {'Country': 'Italy', 'City': 'Caronno Pertusella'}}
{'_id': {'Country': 'Italy', 'City': 'Felizzano'}}
{'_id': {'Country': 'Italy', 'City': 'Impruneta'}}
{'_id': {'Country': 'Italy', 'City': 'Varallo Sesia'}}
{'_id': {'Country': 'Italy', 'City': 'Melegnano'}}
{'_id': {'Country': 'Italy', 'City': 'Lucca'}}
{'_id': {'Country': 'Italy', 'City': 'Ragusa'}}
{'_id': {'Country': 'Italy', 'City': 'Castelfranco Emilia'}}
{'_id': {'Country': 'Italy', 'City': "Fiorenzuola d'Arda"}}
{'_id': {'Country': 'Italy', 'City': 'Desenzano del Garda'}}
{'_id': {'Country': 'Italy', 'City': 'Monghidoro'}}
{'_id': {'Country': 'Italy', 'City': 'Mirandola'}}
{'_id': {'Country': 'Italy', 'City': 'Pianezza'}}
{'_id': {'Country': 'Ital

{'_id': {'Country': 'Netherlands', 'City': 'Born'}}
{'_id': {'Country': 'Netherlands', 'City': 'Schalkhaar'}}
{'_id': {'Country': 'Netherlands', 'City': 'Elst'}}
{'_id': {'Country': 'Netherlands', 'City': 'Lith'}}
{'_id': {'Country': 'Netherlands', 'City': 'Nieuwe-Niedorp'}}
{'_id': {'Country': 'Netherlands', 'City': 'Dinxperlo'}}
{'_id': {'Country': 'Netherlands', 'City': 'Asten'}}
{'_id': {'Country': 'Netherlands', 'City': 'Delft'}}
{'_id': {'Country': 'Netherlands', 'City': 'Heerlen'}}
{'_id': {'Country': 'Netherlands', 'City': 'Keijenborg'}}
{'_id': {'Country': 'Netherlands', 'City': 'Hattem'}}
{'_id': {'Country': 'Netherlands', 'City': 'Naaldwijk'}}
{'_id': {'Country': 'Netherlands', 'City': 'Kampen'}}
{'_id': {'Country': 'Netherlands', 'City': 'Avenhorn'}}
{'_id': {'Country': 'Netherlands', 'City': 'Nieuweroord'}}
{'_id': {'Country': 'Netherlands', 'City': 'Wierden'}}
{'_id': {'Country': 'Netherlands', 'City': 'Grootegast'}}
{'_id': {'Country': 'Netherlands', 'City': 'Harskamp'}}

{'_id': {'Country': 'Poland', 'City': 'Ustronie'}}
{'_id': {'Country': 'Poland', 'City': 'Biłgoraj'}}
{'_id': {'Country': 'Poland', 'City': 'Skulsk'}}
{'_id': {'Country': 'Poland', 'City': 'Dziecmierowo'}}
{'_id': {'Country': 'Poland', 'City': 'Kętrzyn'}}
{'_id': {'Country': 'Poland', 'City': 'Mirsk'}}
{'_id': {'Country': 'Poland', 'City': 'Znin'}}
{'_id': {'Country': 'Poland', 'City': 'Deblin'}}
{'_id': {'Country': 'Poland', 'City': 'Lesnica'}}
{'_id': {'Country': 'Poland', 'City': 'Koło'}}
{'_id': {'Country': 'Poland', 'City': 'Pisz'}}
{'_id': {'Country': 'Poland', 'City': 'Racibórz'}}
{'_id': {'Country': 'Poland', 'City': 'Naglowice'}}
{'_id': {'Country': 'Poland', 'City': 'Sokółka'}}
{'_id': {'Country': 'Poland', 'City': 'Dobron'}}
{'_id': {'Country': 'Poland', 'City': 'Sabnie'}}
{'_id': {'Country': 'Poland', 'City': 'Rogozno'}}
{'_id': {'Country': 'Poland', 'City': 'Sosnowiec'}}
{'_id': {'Country': 'Poland', 'City': 'Bardo'}}
{'_id': {'Country': 'Poland', 'City': 'Będzin'}}
{'_id'

{'_id': {'Country': 'Romania', 'City': 'Zlatna'}}
{'_id': {'Country': 'Romania', 'City': 'Râmnicu Vâlcea'}}
{'_id': {'Country': 'Romania', 'City': 'Darabani'}}
{'_id': {'Country': 'Romania', 'City': 'Mizil'}}
{'_id': {'Country': 'Romania', 'City': 'Darmanesti'}}
{'_id': {'Country': 'Romania', 'City': 'Astileu'}}
{'_id': {'Country': 'Romania', 'City': 'Prejmer'}}
{'_id': {'Country': 'Romania', 'City': 'Motru'}}
{'_id': {'Country': 'Romania', 'City': 'Pantelimon'}}
{'_id': {'Country': 'Romania', 'City': 'Alexandria'}}
{'_id': {'Country': 'Romania', 'City': 'Branceni'}}
{'_id': {'Country': 'Romania', 'City': 'Rosu'}}
{'_id': {'Country': 'Romania', 'City': 'Moreni'}}
{'_id': {'Country': 'Romania', 'City': 'Domnesti'}}
{'_id': {'Country': 'Romania', 'City': 'Bucharest'}}
{'_id': {'Country': 'Romania', 'City': 'Beiuș'}}
{'_id': {'Country': 'Romania', 'City': 'Somcuta Mare'}}
{'_id': {'Country': 'Romania', 'City': 'Maneciu-Ungureni'}}
{'_id': {'Country': 'Romania', 'City': 'Alba Iulia'}}
{'_i

{'_id': {'Country': 'Spain', 'City': 'Gibraleón'}}
{'_id': {'Country': 'Spain', 'City': 'Montehermoso'}}
{'_id': {'Country': 'Spain', 'City': 'Manresa'}}
{'_id': {'Country': 'Spain', 'City': 'San Juan de Aznalfarache'}}
{'_id': {'Country': 'Spain', 'City': 'Figueruelas'}}
{'_id': {'Country': 'Spain', 'City': 'Ciudad Rodrigo'}}
{'_id': {'Country': 'Spain', 'City': 'Formentera de Segura'}}
{'_id': {'Country': 'Spain', 'City': 'Palencia'}}
{'_id': {'Country': 'Spain', 'City': 'Castelldefels'}}
{'_id': {'Country': 'Spain', 'City': 'Mérida'}}
{'_id': {'Country': 'Spain', 'City': 'Alpedrete'}}
{'_id': {'Country': 'Spain', 'City': 'Vila'}}
{'_id': {'Country': 'Spain', 'City': 'Navalmoral de la Mata'}}
{'_id': {'Country': 'Spain', 'City': 'Sineu'}}
{'_id': {'Country': 'Spain', 'City': 'Olloniego'}}
{'_id': {'Country': 'Spain', 'City': 'Sentmenat'}}
{'_id': {'Country': 'Spain', 'City': 'Algete'}}
{'_id': {'Country': 'Spain', 'City': 'Benifaio'}}
{'_id': {'Country': 'Spain', 'City': 'Liencres'}}

{'_id': {'Country': 'Switzerland', 'City': 'Benken'}}
{'_id': {'Country': 'Switzerland', 'City': 'Satigny'}}
{'_id': {'Country': 'Switzerland', 'City': 'Forel'}}
{'_id': {'Country': 'Switzerland', 'City': 'Niederrohrdorf'}}
{'_id': {'Country': 'Switzerland', 'City': 'Oberdiessbach'}}
{'_id': {'Country': 'Switzerland', 'City': 'Saint-Jean'}}
{'_id': {'Country': 'Switzerland', 'City': 'Ober Urdorf'}}
{'_id': {'Country': 'Switzerland', 'City': 'Attalens'}}
{'_id': {'Country': 'Switzerland', 'City': 'Rorschach'}}
{'_id': {'Country': 'Switzerland', 'City': 'Wabern'}}
{'_id': {'Country': 'Switzerland', 'City': 'Sursee'}}
{'_id': {'Country': 'Switzerland', 'City': 'Goldach'}}
{'_id': {'Country': 'Switzerland', 'City': 'Lyss'}}
{'_id': {'Country': 'Switzerland', 'City': 'Herzogenbuchsee'}}
{'_id': {'Country': 'Switzerland', 'City': 'Nidau'}}
{'_id': {'Country': 'Switzerland', 'City': 'Roggwil'}}
{'_id': {'Country': 'Switzerland', 'City': 'Greifensee'}}
{'_id': {'Country': 'Switzerland', 'City'

{'_id': {'Country': 'United Kingdom', 'City': 'Craigavon'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Ashford'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Prudhoe'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Brierley Hill'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Ware'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Tilbury'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Cheadle'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Bonnybridge'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Soham'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Berwick-Upon-Tweed'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Rayleigh'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Sidcup'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Ramsgate'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Forres'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Woodford Green'}}
{'_id': {'Country': 'United Kingdom', 'City': 'Hindhead'}}
{'_id': {'Country': 'United Kingdom', 'City': 

{'_id': {'Country': 'United States', 'City': 'Beaver Falls'}}
{'_id': {'Country': 'United States', 'City': 'Xenia'}}
{'_id': {'Country': 'United States', 'City': 'Accident'}}
{'_id': {'Country': 'United States', 'City': 'La Junta'}}
{'_id': {'Country': 'United States', 'City': 'Branford'}}
{'_id': {'Country': 'United States', 'City': 'Scarsdale'}}
{'_id': {'Country': 'United States', 'City': 'New Boston'}}
{'_id': {'Country': 'United States', 'City': 'Forest Hills'}}
{'_id': {'Country': 'United States', 'City': 'Millburn'}}
{'_id': {'Country': 'United States', 'City': 'Kingfisher'}}
{'_id': {'Country': 'United States', 'City': 'Lake Ariel'}}
{'_id': {'Country': 'United States', 'City': 'Ontario'}}
{'_id': {'Country': 'United States', 'City': 'Callao'}}
{'_id': {'Country': 'United States', 'City': 'Carl Junction'}}
{'_id': {'Country': 'United States', 'City': 'Raeford'}}
{'_id': {'Country': 'United States', 'City': 'Kimberly'}}
{'_id': {'Country': 'United States', 'City': 'Travelers Res

{'_id': {'Country': 'United States', 'City': 'Lewes'}}
{'_id': {'Country': 'United States', 'City': 'Toledo'}}
{'_id': {'Country': 'United States', 'City': 'Girdwood'}}
{'_id': {'Country': 'United States', 'City': 'Boydton'}}
{'_id': {'Country': 'United States', 'City': 'Lake Hopatcong'}}
{'_id': {'Country': 'United States', 'City': 'Swayzee'}}
{'_id': {'Country': 'United States', 'City': 'Alexander City'}}
{'_id': {'Country': 'United States', 'City': 'Yadkinville'}}
{'_id': {'Country': 'United States', 'City': 'Alice'}}
{'_id': {'Country': 'United States', 'City': 'Burt'}}
{'_id': {'Country': 'United States', 'City': 'Machesney Park'}}
{'_id': {'Country': 'United States', 'City': 'Charles Town'}}
{'_id': {'Country': 'United States', 'City': 'North Attleboro'}}
{'_id': {'Country': 'United States', 'City': 'Acton'}}
{'_id': {'Country': 'United States', 'City': 'Cromwell'}}
{'_id': {'Country': 'United States', 'City': 'Hattiesburg'}}
{'_id': {'Country': 'United States', 'City': 'Wahiawa'

{'_id': {'Country': 'United States', 'City': 'Huntington Park'}}
{'_id': {'Country': 'United States', 'City': 'McHenry'}}
{'_id': {'Country': 'United States', 'City': 'North Highlands'}}
{'_id': {'Country': 'United States', 'City': 'Epsom'}}
{'_id': {'Country': 'United States', 'City': 'Pottsville'}}
{'_id': {'Country': 'United States', 'City': 'Wetumpka'}}
{'_id': {'Country': 'United States', 'City': 'Lyndhurst'}}
{'_id': {'Country': 'United States', 'City': 'Lake in the Hills'}}
{'_id': {'Country': 'United States', 'City': 'Crown Point'}}
{'_id': {'Country': 'United States', 'City': 'Mamaroneck'}}
{'_id': {'Country': 'United States', 'City': 'Nashville'}}
{'_id': {'Country': 'United States', 'City': 'Channahon'}}
{'_id': {'Country': 'United States', 'City': 'Lake Zurich'}}
{'_id': {'Country': 'United States', 'City': 'El Monte'}}
{'_id': {'Country': 'United States', 'City': 'Seneca'}}
{'_id': {'Country': 'United States', 'City': 'Maceo'}}
{'_id': {'Country': 'United States', 'City': 

---
Unique `OS`

In [13]:
db.clicks.find().distinct('device.OS')

['Android',
 'BlackBerry OS',
 'Chrome OS',
 'Chromecast',
 'Fedora',
 'FreeBSD',
 'Kindle',
 'Linux',
 'Mac OS X',
 'NetBSD',
 'OpenBSD',
 'Other',
 'Solaris',
 'Tizen',
 'Ubuntu',
 'Windows',
 'Windows Phone',
 'iOS']

In [14]:
len(db.clicks.find().distinct('device.OS'))

18

---
Unique `Browser`

In [15]:
db.clicks.find().distinct('device.Browser')

['AdsBot-Google',
 'AhrefsBot',
 'Amazon Silk',
 'Android',
 'AppEngine-Google',
 'Apple Mail',
 'BingPreview',
 'BlackBerry WebKit',
 'Chrome',
 'Chrome Mobile',
 'Chrome Mobile WebView',
 'Chrome Mobile iOS',
 'Chromium',
 'Coc Coc',
 'Coveobot',
 'Crosswalk',
 'Dragon',
 'DuckDuckBot',
 'Edge',
 'Edge Mobile',
 'Electron',
 'Epiphany',
 'Facebook',
 'FacebookBot',
 'Firefox',
 'Firefox Mobile',
 'Firefox iOS',
 'HbbTV',
 'HeadlessChrome',
 'HubSpot Crawler',
 'IE',
 'IE Mobile',
 'Iceweasel',
 'Iron',
 'JobBot',
 'Jooblebot',
 'K-Meleon',
 'Kindle',
 'Konqueror',
 'Magus Bot',
 'Mail.ru Chromium Browser',
 'Maxthon',
 'Mobile Safari',
 'Mobile Safari UI/WKWebView',
 'MobileIron',
 'NetFront',
 'Netscape',
 'Opera',
 'Opera Coast',
 'Opera Mini',
 'Opera Mobile',
 'Other',
 'PagePeeker',
 'Pale Moon',
 'PetalBot',
 'PhantomJS',
 'Pinterest',
 'Puffin',
 'Python Requests',
 'QQ Browser',
 'QQ Browser Mobile',
 'Radius Compliance Bot',
 'Safari',
 'Samsung Internet',
 'SeaMonkey',
 'Se

In [16]:
len(db.clicks.find().distinct('device.Browser'))

82

---
Unique `Activity`

In [17]:
db.clicks.find().distinct('Activity')

['click', 'pageload']

----
Number of unique prodcuts

In [18]:
len(db.clicks.find().distinct('ProductID'))

10938

---
# Queries

----

----
### Q1
Get the percentage of documents where there are user ids present. Also get the percentage of documents where the user ids are absent.

***Hint - Keep total documents count in a separate variable beforehand.***

In [19]:
db.clicks.find_one()

{'_id': ObjectId('60df1029ad74d9467c91a932'),
 'webClientID': 'WI100000244987',
 'VisitDateTime': datetime.datetime(2018, 5, 25, 4, 51, 14, 179000),
 'ProductID': 'Pr100037',
 'Activity': 'click',
 'device': {'Browser': 'Firefox', 'OS': 'Windows'},
 'user': {'City': 'Colombo', 'Country': 'Sri Lanka'}}

In [20]:
# Find total documents in the collection
total_docs_count = db.clicks.count_documents({})
print(total_docs_count)

6100000


In [24]:
cur = db.clicks.aggregate([
                
            # Stage 1 - check if user.UserID is present
            {
                # fill the operator to use
                '$match': {'user.UserID': {'$exists': True}}
            },

            # Stage 2- count the filtered documents
            {
                # fill in the stage operator to use
                '$count': 'signed_up'
            },

            # Stage 3 - find the percentage
            {
                '$project': {
                    'Percentage_signed_up': {'$multiply':[
                                                            {'$divide': ['$signed_up', total_docs_count]}, 
                                                            100
                                                        ]
                                            },
                    'Percentage_not_signed_up': {'$multiply':[
                                                                # fill the value
                                                                {'$divide': [
                                                                                # users who haven't signed up
                                                                                {'$subtract': [total_docs_count, '$signed_up']}, 
                                                                                total_docs_count
                                                                            ]
                                                                },
                                                                100
                                                            ]
                                                }
                }
            }
        ],
        allowDiskUse=True)

for doc in cur:
    pp.pprint(doc)

{'Percentage_signed_up': 9.873655737704919,
 'Percentage_not_signed_up': 90.12634426229508}


----
### Q2
What was the most popular product?

In [25]:
cur = db.clicks.aggregate([
    
            # Stage 1 - group on the ProductID field
            {
                # fill in the stage operator to use
                '$group': {
                                '_id': '$ProductID',
                                'Count': {'$sum': 1}
                            }
            },
    
            # Stage 2 - sort the groups based on the Count
            {
                "$sort":{"Count":-1}
            },
    
            # Stage 3 - limit the result to return the product with the highest count
            {
                "$limit":1
            }
        ])

for doc in cur:
    pp.pprint(doc)

{'_id': 'Pr100017', 'Count': 157922}


----
### Q3
Count the number of `click` and number of `pageload` for each signed up user. Sort the result in ascending order of the `user.UserID`.

In [27]:
cur = db.clicks.aggregate([
            {"$match":{"user.UserID":{"$exists":"True"}
    }},
    { "$group": {
        "_id": {"User":"$user.UserID","Activity":"$Activity"},
        "count": { "$sum": 1 }
    }},
    {"$sort":{"User":1}}
])

for doc in cur:
    pp.pprint(doc)

{'_id': {'User': 'U119612', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U122540', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U108475', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U107855', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U134906', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U118268', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U116142', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U109785', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U121715', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U108481', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U130464', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U112082', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U102741', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U124548', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U132000', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U126278', 'Activity':

{'_id': {'User': 'U118397', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U111000', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U106784', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U133776', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U111463', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U123114', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U110502', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U120772', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U117044', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U135040', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U122476', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U135101', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U127946', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U130163', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U110014', 'Activity': 'pageload'}, 'count': 37}
{'_id': {'User': 'U111113', 'Activity': 'pag

{'_id': {'User': 'U122930', 'Activity': 'pageload'}, 'count': 10}
{'_id': {'User': 'U111999', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U102039', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U108490', 'Activity': 'click'}, 'count': 46}
{'_id': {'User': 'U105340', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U128505', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U114466', 'Activity': 'click'}, 'count': 21}
{'_id': {'User': 'U128541', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U126794', 'Activity': 'click'}, 'count': 13}
{'_id': {'User': 'U100013', 'Activity': 'pageload'}, 'count': 10}
{'_id': {'User': 'U112099', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U116865', 'Activity': 'pageload'}, 'count': 12}
{'_id': {'User': 'U129912', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U120175', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U122722', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U130175', 'Activity':

{'_id': {'User': 'U101653', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U102728', 'Activity': 'pageload'}, 'count': 24}
{'_id': {'User': 'U111166', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U115361', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126348', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U117633', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U125668', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U125816', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U107864', 'Activity': 'click'}, 'count': 17}
{'_id': {'User': 'U112853', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U122046', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U120006', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U121660', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U115110', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U101311', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U100718', 'Activity': '

{'_id': {'User': 'U103288', 'Activity': 'click'}, 'count': 49}
{'_id': {'User': 'U107953', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U120262', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U105856', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U113216', 'Activity': 'pageload'}, 'count': 13}
{'_id': {'User': 'U107493', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U119060', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U110942', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U103917', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U103535', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U118618', 'Activity': 'pageload'}, 'count': 14}
{'_id': {'User': 'U128741', 'Activity': 'click'}, 'count': 61}
{'_id': {'User': 'U129342', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U105987', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U114727', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U113383', 'Activity'

{'_id': {'User': 'U127851', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U120997', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U118658', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U130031', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U118777', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U110204', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U113700', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U109635', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U101281', 'Activity': 'click'}, 'count': 12}
{'_id': {'User': 'U123952', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U108577', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U102246', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U124130', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U122633', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U134137', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U128213', 'Activity': 'page

{'_id': {'User': 'U101429', 'Activity': 'pageload'}, 'count': 115}
{'_id': {'User': 'U115306', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U129257', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U134161', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U111844', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U129337', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U111507', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U110226', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U117589', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U134523', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U102026', 'Activity': 'click'}, 'count': 91}
{'_id': {'User': 'U132498', 'Activity': 'click'}, 'count': 43}
{'_id': {'User': 'U108093', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U135334', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U132907', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U120731', 'Activity': 'pageloa

{'_id': {'User': 'U116781', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U103935', 'Activity': 'click'}, 'count': 12}
{'_id': {'User': 'U121112', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U135095', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U108159', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U130333', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U128832', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U126024', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U103377', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U106919', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U127466', 'Activity': 'click'}, 'count': 26}
{'_id': {'User': 'U101852', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U131574', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U104618', 'Activity': 'click'}, 'count': 28}
{'_id': {'User': 'U134773', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U124089', 'Activit

{'_id': {'User': 'U131683', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U109248', 'Activity': 'click'}, 'count': 67}
{'_id': {'User': 'U117815', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U126841', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U104296', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U120219', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U116288', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U108458', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U106878', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U106351', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U114031', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U135127', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U127677', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U101041', 'Activity': 'pageload'}, 'count': 17}
{'_id': {'User': 'U102171', 'Activity': 'pageload'}, 'count': 113}
{'_id': {'User': 'U105619', 'A

{'_id': {'User': 'U115404', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U107371', 'Activity': 'click'}, 'count': 30}
{'_id': {'User': 'U121667', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U118686', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U121568', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U110613', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U102476', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U123771', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U117198', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U135910', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U125733', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U126459', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U134343', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U131523', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U129357', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U113896', 'Activity': 'clic

{'_id': {'User': 'U133253', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U101232', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U135082', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U107391', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U102499', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U105351', 'Activity': 'click'}, 'count': 62}
{'_id': {'User': 'U114102', 'Activity': 'pageload'}, 'count': 117}
{'_id': {'User': 'U134297', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U114460', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U121578', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U113736', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U104428', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U108241', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U106890', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U127921', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U104548', 'Activit

{'_id': {'User': 'U135870', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U131197', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126592', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U114097', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U117469', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U106314', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U118557', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U127951', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U119574', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U129634', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U112631', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U131188', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U118351', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U106950', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U119320', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U125637', 'Activity': 

{'_id': {'User': 'U124845', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U110579', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U121491', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U125995', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U125591', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U113003', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U128684', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U104148', 'Activity': 'pageload'}, 'count': 25}
{'_id': {'User': 'U101581', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U128695', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U110730', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U115624', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U127894', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106332', 'Activity': 'click'}, 'count': 13}
{'_id': {'User': 'U122145', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U109799', 'Activity': 'pagel

{'_id': {'User': 'U115529', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U135009', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U135495', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U117496', 'Activity': 'pageload'}, 'count': 11}
{'_id': {'User': 'U125046', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U121069', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U103630', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U132550', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U125534', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U118713', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U116173', 'Activity': 'click'}, 'count': 32}
{'_id': {'User': 'U113171', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U107008', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U119026', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U112956', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U130392', 'Activity': '

{'_id': {'User': 'U111242', 'Activity': 'pageload'}, 'count': 43}
{'_id': {'User': 'U114839', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U121987', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U115230', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U105483', 'Activity': 'pageload'}, 'count': 20}
{'_id': {'User': 'U122873', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U101568', 'Activity': 'pageload'}, 'count': 37}
{'_id': {'User': 'U102327', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U133283', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U119667', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U135149', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U101507', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U132496', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U105417', 'Activity': 'click'}, 'count': 14}
{'_id': {'User': 'U133135', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U112583', 'Activity': 'p

{'_id': {'User': 'U129974', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U132744', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U135755', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106840', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U111151', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U129844', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U126636', 'Activity': 'click'}, 'count': 17}
{'_id': {'User': 'U127844', 'Activity': 'pageload'}, 'count': 11}
{'_id': {'User': 'U111872', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U113009', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U119707', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U116579', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U130512', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U132994', 'Activity': 'click'}, 'count': 10}
{'_id': {'User': 'U103938', 'Activity': 'click'}, 'count': 34}
{'_id': {'User': 'U109198', 'Activity': '

{'_id': {'User': 'U120016', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U124784', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U101918', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U116404', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U125367', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U132577', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U111401', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U127164', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U129028', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U116456', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U101779', 'Activity': 'pageload'}, 'count': 18}
{'_id': {'User': 'U131046', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U111575', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U108154', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U108155', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U117784', 'Activity': 'pageloa

{'_id': {'User': 'U124224', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U119821', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U127836', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U135142', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U133610', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U119422', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U129188', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U122844', 'Activity': 'pageload'}, 'count': 14}
{'_id': {'User': 'U116049', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U135211', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U124272', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U109584', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U133475', 'Activity': 'pageload'}, 'count': 16}
{'_id': {'User': 'U114666', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U122582', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U127292', 'Activity': 'cli

{'_id': {'User': 'U128104', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U133616', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U129029', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U117917', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U117881', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U128790', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U124147', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U109060', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U124372', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U125723', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U100751', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U131710', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U131522', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U103218', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U112949', 'Activity': 'pageload'}, 'count': 10}
{'_id': {'User': 'U113859', 'Activity': 'click'}, 'cou

{'_id': {'User': 'U126678', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U110077', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U132249', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U128125', 'Activity': 'pageload'}, 'count': 47}
{'_id': {'User': 'U126966', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U102065', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U133529', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U117681', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U105423', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U120581', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U113580', 'Activity': 'pageload'}, 'count': 18}
{'_id': {'User': 'U115416', 'Activity': 'click'}, 'count': 21}
{'_id': {'User': 'U135306', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U133402', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U109736', 'Activity': 'click'}, 'count': 42}
{'_id': {'User': 'U119262', 'Activity': '

{'_id': {'User': 'U121491', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U104066', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U112482', 'Activity': 'click'}, 'count': 13}
{'_id': {'User': 'U133792', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U105962', 'Activity': 'click'}, 'count': 13}
{'_id': {'User': 'U136401', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U109208', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U135172', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U117759', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U110170', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U118665', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U101441', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U129178', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126925', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U112658', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U110028', 'Activity': 'pageload'}, 

{'_id': {'User': 'U113457', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U134551', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U112344', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U128061', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U129692', 'Activity': 'click'}, 'count': 30}
{'_id': {'User': 'U131468', 'Activity': 'click'}, 'count': 38}
{'_id': {'User': 'U108500', 'Activity': 'pageload'}, 'count': 24}
{'_id': {'User': 'U118869', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U123024', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U102053', 'Activity': 'click'}, 'count': 28}
{'_id': {'User': 'U124654', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U128974', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U133581', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U108513', 'Activity': 'pageload'}, 'count': 75}
{'_id': {'User': 'U111876', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U126848', 'Activity': '

{'_id': {'User': 'U112806', 'Activity': 'pageload'}, 'count': 17}
{'_id': {'User': 'U110995', 'Activity': 'pageload'}, 'count': 10}
{'_id': {'User': 'U123831', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U121853', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U128267', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U116457', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U134787', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U117979', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U124827', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U125465', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U134371', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U129966', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U111079', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U118432', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U121009', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U107299', 'Activity': 'pag

{'_id': {'User': 'U128344', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U117316', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U100190', 'Activity': 'click'}, 'count': 26}
{'_id': {'User': 'U121673', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U106446', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U129701', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U103312', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U123357', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U107661', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U103618', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U100042', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U118197', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U105129', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U120655', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U101366', 'Activity': 'click'}, 'count': 34}
{'_id': {'User': 'U108605', 'Activity': 'pagelo

{'_id': {'User': 'U101502', 'Activity': 'pageload'}, 'count': 45}
{'_id': {'User': 'U126415', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U130377', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U115581', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U124325', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U109653', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U127224', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U113760', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U117295', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U116421', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U120455', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106970', 'Activity': 'pageload'}, 'count': 14}
{'_id': {'User': 'U120110', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U107802', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U110884', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U123606', 'Activi

{'_id': {'User': 'U115352', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U124843', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U134323', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U124560', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U108460', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U135191', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U130903', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U110434', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U111628', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U135591', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U133200', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U127306', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U116722', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U101361', 'Activity': 'pageload'}, 'count': 35}
{'_id': {'User': 'U112890', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U118841', 'Activity': 'page

{'_id': {'User': 'U119445', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U123977', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U118721', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U117834', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106160', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U129226', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U128586', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U128342', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U132958', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U121105', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U107262', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U115364', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U124086', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U121917', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126655', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U115349', 'Activity

{'_id': {'User': 'U101095', 'Activity': 'click'}, 'count': 24}
{'_id': {'User': 'U118456', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U122977', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U125457', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U130740', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U103257', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U105467', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U133428', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U105041', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U125434', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U126721', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U112183', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U122848', 'Activity': 'click'}, 'count': 33}
{'_id': {'User': 'U103100', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U101338', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U100696', 'Activity': 'page

{'_id': {'User': 'U129906', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U113986', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U116415', 'Activity': 'pageload'}, 'count': 13}
{'_id': {'User': 'U100457', 'Activity': 'click'}, 'count': 68}
{'_id': {'User': 'U135895', 'Activity': 'pageload'}, 'count': 11}
{'_id': {'User': 'U112674', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U135911', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U134652', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126850', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U102731', 'Activity': 'click'}, 'count': 15}
{'_id': {'User': 'U114343', 'Activity': 'click'}, 'count': 52}
{'_id': {'User': 'U124674', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U112971', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106471', 'Activity': 'click'}, 'count': 23}
{'_id': {'User': 'U117783', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U133290', 'Activity

{'_id': {'User': 'U125464', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U101847', 'Activity': 'click'}, 'count': 102}
{'_id': {'User': 'U136958', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U132143', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U114917', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U126938', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U136763', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U106052', 'Activity': 'pageload'}, 'count': 17}
{'_id': {'User': 'U115502', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U133119', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U121840', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U134776', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U115760', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U116788', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U119397', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U102904', 'Activity': 'pagel

{'_id': {'User': 'U105931', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U109910', 'Activity': 'pageload'}, 'count': 7}
{'_id': {'User': 'U130178', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U107175', 'Activity': 'pageload'}, 'count': 172}
{'_id': {'User': 'U102746', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U131074', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U112371', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U111492', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U105888', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U130816', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U135003', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U134912', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U121999', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U129566', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U120184', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U110705', '

{'_id': {'User': 'U122936', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U104168', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U115240', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U123687', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U123718', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U129613', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U101272', 'Activity': 'click'}, 'count': 14}
{'_id': {'User': 'U102599', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U125542', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U133155', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U121704', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U103667', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U109454', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U125633', 'Activity': 'pageload'}, 'count': 8}
{'_id': {'User': 'U133348', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U107283', 'Acti

{'_id': {'User': 'U106507', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U135340', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U128292', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U127503', 'Activity': 'click'}, 'count': 16}
{'_id': {'User': 'U130645', 'Activity': 'click'}, 'count': 12}
{'_id': {'User': 'U121083', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U107389', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U104138', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U134647', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U106265', 'Activity': 'click'}, 'count': 21}
{'_id': {'User': 'U110073', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U135415', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U129151', 'Activity': 'click'}, 'count': 33}
{'_id': {'User': 'U107241', 'Activity': 'pageload'}, 'count': 14}
{'_id': {'User': 'U117663', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U136765', 'Activity': 

{'_id': {'User': 'U135125', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U120860', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U112007', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U123812', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U113251', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U133733', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U114499', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U103668', 'Activity': 'click'}, 'count': 46}
{'_id': {'User': 'U106358', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U109069', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126560', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U101193', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U108237', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U100789', 'Activity': 'click'}, 'count': 13}
{'_id': {'User': 'U130456', 'Activity': 'pageload'}, 'count': 13}
{'_id': {'User': 'U133078', 'Activity': 'pageload'},

{'_id': {'User': 'U117148', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U105686', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U106330', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U133157', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U123234', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U119435', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U116944', 'Activity': 'click'}, 'count': 55}
{'_id': {'User': 'U134066', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U123592', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U108357', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U119117', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U107104', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U106286', 'Activity': 'click'}, 'count': 6}
{'_id': {'User': 'U125082', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U115437', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U107525', 'Activity': 'pageloa

{'_id': {'User': 'U109703', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U130369', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U121401', 'Activity': 'click'}, 'count': 5}
{'_id': {'User': 'U104712', 'Activity': 'pageload'}, 'count': 38}
{'_id': {'User': 'U130053', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U132859', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U110358', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U108921', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U111404', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U118503', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U114730', 'Activity': 'pageload'}, 'count': 23}
{'_id': {'User': 'U108739', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U123226', 'Activity': 'click'}, 'count': 12}
{'_id': {'User': 'U129077', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U125724', 'Activity': 'click'}, 'count': 19}
{'_id': {'User': 'U104721', 'Activit

{'_id': {'User': 'U112552', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U133039', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U113706', 'Activity': 'click'}, 'count': 20}
{'_id': {'User': 'U126093', 'Activity': 'click'}, 'count': 9}
{'_id': {'User': 'U123972', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U126119', 'Activity': 'pageload'}, 'count': 4}
{'_id': {'User': 'U104137', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U132383', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U119906', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U123607', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U116918', 'Activity': 'pageload'}, 'count': 47}
{'_id': {'User': 'U112967', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U130229', 'Activity': 'pageload'}, 'count': 9}
{'_id': {'User': 'U129729', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U114801', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U102088', 'Activity': '

{'_id': {'User': 'U104460', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U109246', 'Activity': 'click'}, 'count': 12}
{'_id': {'User': 'U124614', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U135617', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U102370', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U114190', 'Activity': 'click'}, 'count': 11}
{'_id': {'User': 'U125609', 'Activity': 'click'}, 'count': 4}
{'_id': {'User': 'U100149', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U118506', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U114791', 'Activity': 'pageload'}, 'count': 2}
{'_id': {'User': 'U105840', 'Activity': 'pageload'}, 'count': 6}
{'_id': {'User': 'U105886', 'Activity': 'click'}, 'count': 170}
{'_id': {'User': 'U112042', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U125984', 'Activity': 'pageload'}, 'count': 11}
{'_id': {'User': 'U125236', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U114167', 'Activity'

{'_id': {'User': 'U129520', 'Activity': 'pageload'}, 'count': 5}
{'_id': {'User': 'U131841', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U136330', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U125543', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U134782', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U125467', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U124210', 'Activity': 'click'}, 'count': 3}
{'_id': {'User': 'U111947', 'Activity': 'click'}, 'count': 2}
{'_id': {'User': 'U119534', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U131113', 'Activity': 'click'}, 'count': 1}
{'_id': {'User': 'U102203', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U136733', 'Activity': 'pageload'}, 'count': 1}
{'_id': {'User': 'U100846', 'Activity': 'click'}, 'count': 8}
{'_id': {'User': 'U118123', 'Activity': 'pageload'}, 'count': 3}
{'_id': {'User': 'U114308', 'Activity': 'click'}, 'count': 7}
{'_id': {'User': 'U136032', 'Activity': 'click'},

---
### Q4
Count the number of records per `Country`. Sort the result in descending order of number of records.

In [28]:
cur = db.clicks.aggregate([
           { "$group": {
        "_id": "$user.Country",
        "count": { "$sum": 1 }
           }},
    {"$sort":{"count":-1}}
        ])

for doc in cur:
    pp.pprint(doc)

{'_id': 'India', 'count': 2663843}
{'_id': 'United States', 'count': 833389}
{'_id': 'United Kingdom', 'count': 162125}
{'_id': None, 'count': 129875}
{'_id': 'Germany', 'count': 115674}
{'_id': 'Singapore', 'count': 107007}
{'_id': 'Canada', 'count': 100649}
{'_id': 'Australia', 'count': 94601}
{'_id': 'France', 'count': 86974}
{'_id': 'Turkey', 'count': 86459}
{'_id': 'Republic of Korea', 'count': 73706}
{'_id': 'Vietnam', 'count': 72577}
{'_id': 'Pakistan', 'count': 68222}
{'_id': 'Brazil', 'count': 66953}
{'_id': 'Malaysia', 'count': 62935}
{'_id': 'Netherlands', 'count': 57270}
{'_id': 'Italy', 'count': 50987}
{'_id': 'Spain', 'count': 49761}
{'_id': 'Indonesia', 'count': 46098}
{'_id': 'Philippines', 'count': 46006}
{'_id': 'Russia', 'count': 44945}
{'_id': 'Hong Kong', 'count': 43101}
{'_id': 'Taiwan', 'count': 40926}
{'_id': 'Poland', 'count': 39704}
{'_id': 'China', 'count': 35177}
{'_id': 'Egypt', 'count': 33660}
{'_id': 'Thailand', 'count': 33591}
{'_id': 'Israel', 'count': 

----
### Q5
What is the most common/frequently used `OS`? 

And, what is the most common/frequently used `Browser`? 

Also get the count for both.

In [None]:
db.clicks.find_one()

In [29]:
OS = db.clicks.aggregate([
           { "$group": {
        "_id": "$device.OS",
        "count": { "$sum": 1 }
           }},
    {"$sort":{"count":-1}},
    {"$limit":1}
        ])

for doc in OS:
    pp.pprint(doc)

{'_id': 'Windows', 'count': 3931349}


In [30]:
browser = db.clicks.aggregate([
           { "$group": {
        "_id": "$device.Browser",
        "count": { "$sum": 1 }
           }},
    {"$sort":{"count":-1}},
    {"$limit":1}
        ])
for doc in browser:
    pp.pprint(doc)

{'_id': 'Chrome', 'count': 4360498}


In [None]:
cur = db.clicks.aggregate([
    
            # Stage - Sub-pipeline for each result
            {
                # Fill in the stage operator
                '__fill__': {
                    
                                'Most_used_OS': [
                                                    # Enter your code here
                                                ],

                                'Most_used_Browser': [
                                                        # Enter your code here
                                                    ]
                            }
            }
        ])

for doc in cur:
    pp.pprint(doc)

---
### Q6
What is most common `OS` and `Browser` combination used by the users? Also get the count for it.

`Example - {'Linux', 'Firefox'}`

In [31]:
cur = db.clicks.aggregate([
             { "$group": {
        "_id": {"OS":"$device.OS","browser":"$device.Browser"},
        "count": { "$sum": 1 }
           }},
           {"$sort":{"count":-1}},
           {"$limit":1}
])

for doc in cur:
    pp.pprint(doc)

{'_id': {'OS': 'Windows', 'browser': 'Chrome'}, 'count': 3589891}


----
### Q7
How many unique users were active in each week from  07/05/2018 - 27/05/2018?

That is, how may unique users visited in the week from 07/05/2018 - 14/05/2018, from 15/05/2018 - 21/05/2018, and so on.

***Hint - You will need the `$addToSet` operator of `$bucket` stage operator and you will need to use the `$size` aggregation operator.***

In [None]:
cur = db.clicks.aggregate([
    
            # Stage 1 - filter documents for signed up users
            {
                "$match":{"VisitDateTime":{"$gte":datetime(2018,5,7),"$lte":datetime(2018,5,27)}}
            },
    
            # Stage 2 - bucket by each week from 07/05/2018 - 27/05/2018
            {
                '$bucket': {
                                # fill in the value
                                'groupBy': '$VisitDateTime',
                    
                                'boundaries': [datetime(2018,5,7),datetime(2018,5,15),datetime(2018,5,22),datetime(2018,5,28)] 
                                            ,
                                # use $addToSet group accumulator operator
                                #"$addToSet": '$user.UserID' ,
                                # to get an array of unique users for each week
                                'output': {
                                                "Number of Users":{"$sum":1}
                                        }
                            }
            }
        ],
        allowDiskUse=True)

for doc in cur:
    pp.pprint(doc)

----
### Q8
From all the unique users who visited between 07/05/2018 - 15/05/2018 dates, who visited the most number of times? Also get a list of unique products viewed by that user in the same duration.

***Hint - Use the `$addToSet` group accumulator operator.***

In [None]:
db.clicks.find_one()

In [None]:
cur = db.clicks.aggregate([
            
            # Stage 1 - filter the documents between the dates and where the user id exists
            {
                '$match': { 
                             "$and":[{"VisitDateTime":{"$gte":datetime(2018,5,7),"$lte":datetime(2018,5,15)}},{"user.UserID":{"$exists":"True"}}]   
                            }
            },
    
            # Stage 2 - group the documents on the user id,
            #           count how many times that user visited the website,
            #           get an array of unique products viewed by user
            {
                '$group': {
                            '_id': '$user.UserID',
                            'Count': {'$sum': 1},
                    
                            # get an array of unique products viewed by the user
                            'Products':{"$addToSet":"$ProductID"}
                        }
            },
            
            # Stage 3 - sort the group by the count of visits
            {
                "$sort":{"Count":-1}
            },
            
            # Stage 4 - return the user who visited the most
            {
                "$limit":1
            }
        ])

for doc in cur:
    pp.pprint(doc)

---
### Q9
Get the number of times each unique product was viewed by the user from the previous question in the same time duration.

In [None]:
cur = db.clicks.aggregate([
            {
                '$match': { 
                                # specify the date range
                                'VisitDateTime': {
                                                    "$gte":datetime(2018,5,7),"$lte":datetime(2018,5,15)
                                                },
                                
                                # fill in with the user id from previous query
                                'user.UserID': 'U134751'
                            }
            },
            {
                '$group': {
                            '_id': '$ProductID',
                            'Count': {'$sum': 1}
                        }
            }
        ])

for doc in cur:
    pp.pprint(doc)

------
### Q10
What is the last viewed product by each signed up user till 27/05/2018?

***Hint - Use the `$last` group accumulator operator.***

In [None]:
cur = db.clicks.aggregate([            
          # Stage 1 - filter documents where the user id exists
            {"$match":
             {
                 "user.UserID":{"$exists":"True"}
             }
            },    
            # Stage 2 - sort documents by user id and date of visit
            {
                '$sort':{"user.UserId":1,"VisitDateTime":-1}
            },
    
            # Stage 3 - group on the user id and find the last product viewed
            {
                '$group': {
                            # fill the following
                            '_id': "$user.UserId",
                            # get the last viewed product
                            'Last_product_viewed':{"$last":"$ProductId"}
                        }
            },
        ],
        allowDiskUse=True)
for doc in cur:
    pp.pprint(doc)