<a href="https://colab.research.google.com/github/Giffy/MongoDB_PyMongo_Tutorial/blob/master/1_2_Bicing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Example 1 - Bicing stations

In [47]:
# Download and unpack the MongoDB tools
!wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-debian71-3.0.15.tgz  # Download MongoDB from the official repository
!tar xfv mongodb-linux-x86_64-debian71-3.0.15.tgz  >/dev/null                    # Unpack the compressed file
!rm mongodb-linux-x86_64-debian71-3.0.15.tgz                                     # Remove the downloaded archive

# dataset = "https://www.bicing.cat/availability_map/getJsonObject"     # Live JSON feed from Bicing (not used here)
dataset = "https://raw.githubusercontent.com/Giffy/MongoDB_PyMongo_Tutorial/master/resources/bicing_data.csv"
!wget $dataset                                                                   # Download the dataset

# Uploading data to Mongo Database
!mongodb-linux-x86_64-debian71-3.0.15/bin/mongoimport --host brny4kjelauboxl-mongodb.services.clever-cloud.com \
                                                      --port 27017 \
                                                      --username='u1kkdrchfjim80tclysv' \
                                                      --password='FeesC2ACNmI7be61RTst' \
                                                      --db brny4kjelauboxl \
                                                      --collection bicing \
                                                      --type csv\
                                                      --file bicing_data.csv\
                                                      --drop --headerline

--2020-03-24 17:21:10--  https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-debian71-3.0.15.tgz
Resolving fastdl.mongodb.org (fastdl.mongodb.org)... 13.32.85.31, 13.32.85.204, 13.32.85.119, ...
Connecting to fastdl.mongodb.org (fastdl.mongodb.org)|13.32.85.31|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 70878938 (68M) [application/x-gzip]
Saving to: ‘mongodb-linux-x86_64-debian71-3.0.15.tgz’


2020-03-24 17:21:15 (14.4 MB/s) - ‘mongodb-linux-x86_64-debian71-3.0.15.tgz’ saved [70878938/70878938]

--2020-03-24 17:21:20--  https://raw.githubusercontent.com/Giffy/MongoDB_PyMongo_Tutorial/master/resources/bicing_data.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 89518 (87K) [text/plain]
Saving to: ‘bicing_data.csv.2’


2020-03-24 
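
The `--headerline` flag in the import command tells `mongoimport` to take the field names from the first line of the CSV file. As a rough illustration of that mapping, the sketch below does the same with Python's `csv.DictReader` on a tiny in-memory sample. The column subset and their order are assumptions made for this example, not the real layout of `bicing_data.csv`.

```python
import csv
import io

# Hypothetical two-line CSV; the real file has more columns and rows
sample_csv = io.StringIO(
    'id,streetName,bikes,slots,status\n'
    '1,Gran Via Corts Catalanes,25,2,OPN\n'
)

# The first line supplies the keys, every following line becomes one record
documents = list(csv.DictReader(sample_csv))
print(documents[0])
```

Note that `csv.DictReader` keeps every value as text; how `mongoimport` types each value is a separate matter and is not modelled here.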

## 1. Install PyMongo

In [0]:
!pip install pymongo==3.7.2 folium==0.8.3  >/dev/null      # Install PyMongo and folium for map visualization

## 2. Import libraries

In [0]:
import pymongo                            # Library to access MongoDB
from pymongo import MongoClient           # Imports MongoClient 
import pandas as pd                       # Library to work with dataframes
import folium                             # Library to visualize a map

## 3. Connect to database

In [0]:
# A URI (uniform resource identifier) defines the connection parameters:
# uri = 'mongodb://USER:PASSWORD@SERVER_NAME:PORT/DATABASE_NAME'
# For a replica set, list every host before the database name:
# uri = 'mongodb://USER:PASSWORD@HOST_1:PORT,HOST_2:PORT,HOST_3:PORT/DATABASE_NAME'
# For a local server without authentication:
# uri = 'localhost:27017'
uri = 'mongodb://u1kkdrchfjim80tclysv:FeesC2ACNmI7be61RTst@brny4kjelauboxl-mongodb.services.clever-cloud.com:27017/brny4kjelauboxl'

# start client to connect to MongoDB server 
client = MongoClient( uri )
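
If a user name or password contains characters such as `@`, `:` or `/`, they have to be percent-encoded before they go into the URI. The following is a minimal sketch of one way to assemble the string with the standard library; `build_mongo_uri` is a hypothetical helper, and the host, database and credentials are placeholders, not values from this workshop.

```python
from urllib.parse import quote_plus

def build_mongo_uri(user, password, host, port, database):
    """Assemble a mongodb:// URI, percent-encoding the credentials."""
    return 'mongodb://{}:{}@{}:{}/{}'.format(
        quote_plus(user), quote_plus(password), host, port, database)

# Placeholder values for illustration only
example_uri = build_mongo_uri('workshop_user', 'p@ss:word/1',
                              'db.example.com', 27017, 'bicing_db')
print(example_uri)
# → mongodb://workshop_user:p%40ss%3Aword%2F1@db.example.com:27017/bicing_db
```

The resulting string can be passed to `MongoClient` in the same way as the hand-written `uri` above.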

In [0]:
db = client.brny4kjelauboxl               # Set the database to work on
db.list_collection_names()                # List the collections available
collection = db.bicing                    # Collection alias

## 4. Quick data overview

In [52]:
num_documents = collection.count_documents({'_id' : {'$exists' : 1}})     # Counts the documents in the collection
print ( 'Number of documents in database = ' + str(num_documents) )
list ( collection.find().limit(1) )                                       # Shows the first document

Number of documents in database = 926


[{'_id': ObjectId('5e7a41938307b5e3d4de02e4'),
  'altitude': 21,
  'bikes': 25,
  'id': 1,
  'latitude': 41.397952,
  'longitude': 2.180042,
  'nearbyStations': '24, 369, 387, 426',
  'slots': 2,
  'status': 'OPN',
  'streetName': 'Gran Via Corts Catalanes',
  'streetNumber': 760,
  'type': 'BIKE',
  'updateTime': '01/08/18 17:43:08'}]

In [0]:
# The 'bikes' values are stored as strings instead of numbers.
# To filter with a numeric comparison such as "greater than or equal", they must be converted to integers.
# MongoDB 4 makes it easy to change a field's type on the server.
# This workshop uses MongoDB 3, so the conversion below is done from the client instead.

bikes_list = list(collection.distinct('bikes'))             # Unique values of 'bikes' (a list of strings)
for num in bikes_list:                                      # Iterate over the unique values
    collection.update_many({'bikes': num}, {'$set': {'bikes': int(num)}})    # Replace each string value with the same number as an integer
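
To see why the conversion matters, here is a small sketch on plain Python dictionaries, with no database involved. The documents are made up for the example and `to_int_field` is a hypothetical helper. Text is ordered character by character, so number-like strings do not sort or compare the way integers do. MongoDB's comparison operators also generally match only values of the same type as the query value, so a numeric `$gte` would skip string values.

```python
# Hypothetical documents as they would look if 'bikes' were stored as text
docs = [{'id': 1, 'bikes': '25'}, {'id': 2, 'bikes': '3'}, {'id': 3, 'bikes': '0'}]

def to_int_field(documents, field):
    """Return copies of the documents with `field` converted to int."""
    return [dict(doc, **{field: int(doc[field])}) for doc in documents]

converted = to_int_field(docs, 'bikes')

# Strings are ordered character by character, so '3' sorts after '25'
print(sorted(d['bikes'] for d in docs))        # → ['0', '25', '3']
# Integers are ordered by value
print(sorted(d['bikes'] for d in converted))   # → [0, 3, 25]
```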


## 5. Query the database: get active stations with at least 3 bicycles

In [0]:
# Load the result of a database query into a pandas DataFrame
filters = {'status': 'OPN', 'bikes': {'$gte': 3}}     # Query operator $gte = "greater than or equal"
fields = {'_id': 1, 'latitude': 1, 'longitude': 1, 'bikes': 1, 'slots': 1}   # Projection: return only these fields

query = list( collection.find( filters , fields ) )
df = pd.DataFrame ( query )                             # Load the database reply in a Pandas DataFrame
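
For readers new to query documents, the sketch below mimics the filter and the projection on plain Python dictionaries. It is only an illustration of the intended semantics, not how MongoDB evaluates queries. `matches` and `project` are hypothetical helpers, and all stations except the first (taken from the sample document shown earlier) are invented, including the `'CLS'` status value.

```python
def matches(doc, min_bikes=3):
    """Mirror of {'status': 'OPN', 'bikes': {'$gte': min_bikes}} for one dict."""
    return doc.get('status') == 'OPN' and doc.get('bikes', 0) >= min_bikes

def project(doc, names):
    """Keep only the listed fields, like a projection."""
    return {name: doc[name] for name in names if name in doc}

stations = [
    {'id': 1, 'status': 'OPN', 'bikes': 25, 'slots': 2},   # sample document from the overview
    {'id': 2, 'status': 'OPN', 'bikes': 1, 'slots': 20},   # hypothetical: open, too few bikes
    {'id': 3, 'status': 'CLS', 'bikes': 10, 'slots': 5},   # hypothetical: enough bikes, not open
]

selected = [project(s, ['id', 'bikes', 'slots']) for s in stations if matches(s)]
print(selected)   # → [{'id': 1, 'bikes': 25, 'slots': 2}]
```

Both conditions in the filter must hold at once, which is why only the first station is kept.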

In [55]:
print ( 'Number of active stations with at least 3 bicycles: ' + str(len (query)) )

Number of active stations with at least 3 bicycles: 662


In [56]:
df.iloc[0] # prints the first DataFrame row 

_id          5e7a41938307b5e3d4de02e4
latitude                       41.398
longitude                     2.18004
slots                               2
bikes                              25
Name: 0, dtype: object

## 6. Mark Bicing stations on the map

In [57]:
center_lat = 41.378
center_lon = 2.139

locationmap = folium.Map(location=[ center_lat , center_lon ], zoom_start=16, width=800, height=600 )

for _, row in df.iterrows():                      # One marker per station returned by the query
    lng = float(row['longitude'])
    lat = float(row['latitude'])
    description = 'Bikes: ' + str(row['bikes']) + '<br> Empty slots: ' + str(row['slots'])
    folium.Marker( [ lat , lng ],
                 popup= description,
                 icon=folium.Icon(color='red')).add_to(locationmap)

locationmap
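
The map above is centered on fixed coordinates. As an alternative, the center could be derived from the stations themselves. The sketch below averages a list of `(latitude, longitude)` pairs. `center_of` is a hypothetical helper, and the two points are only examples (the sample station from the overview and the fixed center used above).

```python
def center_of(points):
    """Return the mean (latitude, longitude) of a non-empty list of points."""
    lats = [lat for lat, _ in points]
    lngs = [lng for _, lng in points]
    return sum(lats) / len(lats), sum(lngs) / len(lngs)

# Example points only: the sample station and the fixed map center
sample_points = [(41.397952, 2.180042), (41.378, 2.139)]
mean_lat, mean_lon = center_of(sample_points)
print(round(mean_lat, 6), round(mean_lon, 6))
```

With the DataFrame from the query, the same helper could be fed with `list(zip(df['latitude'], df['longitude']))`, and the result passed as `location` to `folium.Map`.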