### Best practices:
* don't let Python infer the CSV headers; set them manually
* add entries to dicts via `dict.update()` rather than direct assignment (`dict['new_key'] = 'new_value'`)
* Abstract Base Classes for Containers - https://docs.python.org/3/library/collections.abc.html
* mypy - http://mypy-lang.org/
* pyflame - https://github.com/uber/pyflame
* MonkeyType - https://github.com/Instagram/MonkeyType
* pipenv - https://github.com/pypa/pipenv
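
A minimal sketch of the first two bullets, using only the standard library. The CSV content and the field names `name` and `station_id` are made up for illustration.

```python
import csv
import io

# Hypothetical CSV content; its first row is a header we do not want to rely on.
raw = "city,id\nMadrid,90155\nBarcelona,90595\n"

# Set the field names manually instead of letting DictReader take them
# from the first row of the file.
fieldnames = ["name", "station_id"]
reader = csv.DictReader(io.StringIO(raw), fieldnames=fieldnames)
next(reader)  # skip the header row that is still present in the data
rows = [dict(row) for row in reader]

# Add several keys in one call with dict.update().
record = {"name": "Madrid"}
record.update({"station_id": "90155", "country": "ES"})

print(rows)
print(record)
```

When `fieldnames` is passed explicitly, `DictReader` treats every row as data, which is why the header row has to be skipped by hand.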

In [47]:
from time import time
time0 = time()
for i in range(int(1e8)):
    pass
print(time() - time0)

3.406897783279419


### `is` compares object ids:

In [36]:
a = 2
id(a)

4485335760

In [37]:
b = 2
id(b)

4485335760

In [27]:
b is a

True
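
The identical ids above come from CPython caching small integers, which is an implementation detail. A short sketch with lists (variable names are illustrative) shows where `is` and `==` differ:

```python
x = [1, 2, 3]
y = [1, 2, 3]  # equal contents, but a separate object
z = x          # second name for the same object

same_value = x == y          # compares contents
same_object = x is y         # compares identity
alias = z is x               # both names point at one object
ids_match = id(z) == id(x)   # `is` is equivalent to comparing ids

print(same_value, same_object, alias, ids_match)
```

So `==` is the right check for value equality; `is` is mostly for singletons such as `None`.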

### Emojis

In [39]:
from emoji import emojize
emojize('Matus works for :pill:')

'Matus works for 💊'

### Feedback on entry task
Best practices:
* use a Pipfile or requirements.txt (dependency management)
* functions, classes
* tests
* Dockerfile (why?)

### Requests, html

In [57]:
import requests
import lxml.html as html
import re

r = requests.get('https://en.wikipedia.org/wiki/Tasmania')
tree = html.fromstring(r.text)
# number of li elements
len(tree.cssselect('li'))

612

In [69]:
# Beautiful Soup
from bs4 import BeautifulSoup

link = 'https://en.wikipedia.org/wiki/Tasmania'
page = requests.get(link)
soup = BeautifulSoup(page.text, 'html.parser')

### Alsa scraping

In [183]:
from pprint import pprint
from urllib.parse import parse_qs, urlparse, urlencode
import requests
url = 'https://www.alsa.com/en/web/bus/home'

user_dep = 'Madrid (All stops)'
user_dest = 'Barcelona (All stops)'
user_time_dep = '09/13/2018' # mm/dd/yyyy
user_passengers = 2

url_locations = 'https://www.alsa.com/en/c/portal/layout?p_l_id=70167&p_p_cacheability=cacheLevelPage&p_p_id=JourneySearchPortlet_WAR_Alsaportlet&p_p_lifecycle=2&p_p_resource_id=JsonGetOrigins&locationMode=1&_=1536399884758'
locations_json = requests.get(url_locations).json()

def get_id(city):
    for location in locations_json:
        if location['name'] == city:
            return location['id']

# build query
url_template = 'https://www.alsa.com/en/web/bus/checkout?p_auth=9wzyfRYp&p_p_id=PurchasePortlet_WAR_Alsaportlet&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&p_p_col_id=column-1&p_p_col_count=3&_PurchasePortlet_WAR_Alsaportlet_javax.portlet.action=searchJourneysAction&code=&serviceType=&accessible=0&originStationNameId=Madrid+(All+stops)&destinationStationNameId=Barcelona+(All+stops)&originStationId=90155&destinationStationId=90595&departureDate=09%2F09%2F2018&_departureDate=09%2F09%2F2018&returnDate=&_returnDate=&locationMode=1&passengerType-1=1&passengerType-4=0&passengerType-5=0&passengerType-2=0&passengerType-3=0&numPassengers=1&regionalZone=&travelType=OUTWARD&LIFERAY_SHARED_isTrainTrip=false&promoCode=&jsonAlsaPassPassenger=&jsonVoucherPassenger='
# parse_qs returns a list per key; keep only the first value
query_dict = {key: values[0] for key, values in parse_qs(urlparse(url_template).query).items()}

# update query
query_dict['originStationNameId'] = user_dep
query_dict['originStationId'] = get_id(user_dep)
query_dict['destinationStationNameId'] = user_dest
query_dict['destinationStationId'] = get_id(user_dest)
query_dict['departureDate'] = user_time_dep
query_dict['_departureDate'] = user_time_dep
query_dict['passengerType-1'] = user_passengers

# pprint(query_dict)
query_string = urlencode(query_dict)

from requests_html import HTMLSession
session = HTMLSession()
page = session.get('https://www.alsa.com/en/web/bus/checkout?', params=query_string)
url_results = page.html.find('data-sag-journeys-component', first=True).attrs['sag-journeys-table-body-url']

results_json = session.get(url_results).json()
result = results_json['journeys'][0]

output = {'departure_time': result['departureDataToFilter'],
         'arrival_time': result['arrivalDataToFilter'],
         'from': result['originName'],
         'to': result['destinationName'],
         'type': 'bus' if 'busCharacteristic' in result else 'train',
         'price': result['fares'][0]['price'],
         }

pprint(output)


{'arrival_time': '13/09/2018 07:19',
 'departure_time': '13/09/2018 00:14',
 'from': 'Madrid - Barajas Airport T4',
 'price': 90.0,
 'to': 'Barcelona Sants Station',
 'type': 'bus'}
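
The query-rewriting step in the cell above can be sketched as a small reusable helper. `rewrite_query` and the `example.com` URL are hypothetical. Unlike the cell above, this sketch passes `keep_blank_values=True`, so empty parameters such as `returnDate=` are kept instead of dropped.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def rewrite_query(url, **overrides):
    """Return `url` with the given query parameters replaced or added."""
    parts = urlparse(url)
    # parse_qs gives a list per key; keep the first value of each
    params = {
        key: values[0]
        for key, values in parse_qs(parts.query, keep_blank_values=True).items()
    }
    params.update(overrides)
    return urlunparse(parts._replace(query=urlencode(params)))


# Hypothetical, shortened template URL
template = (
    "https://example.com/checkout"
    "?originStationId=90155&departureDate=09%2F09%2F2018&returnDate="
)
new_url = rewrite_query(template, departureDate="09/13/2018", numPassengers=2)
print(new_url)
```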


### Redis database
* stores data in RAM
* available as a pip package for Python 3: `redis`
* useful for caching: on a cache miss, fetch from the source and save the result in the cache
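
A minimal sketch of the caching pattern from the last bullet. `FakeCache` is an in-memory stand-in that only mimics the `get`/`set` calls of a Redis client, so the example runs without a server. `fetch_from_source` and the key format are assumptions.

```python
import json


class FakeCache:
    """In-memory stand-in for a Redis client (hypothetical, get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Like Redis, return None on a miss and bytes on a hit
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value.encode("utf-8")


source_calls = []  # records how often the slow source is hit


def fetch_from_source(city):
    # Hypothetical slow lookup, e.g. a request to a remote site
    source_calls.append(city)
    return {"city": city, "id": 90595}


def get_city(cache, city):
    key = f"city_{city}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached.decode("utf-8"))  # cache hit
    value = fetch_from_source(city)                # cache miss: go to the source
    cache.set(key, json.dumps(value))              # save for next time
    return value


cache = FakeCache()
first = get_city(cache, "barcelona")   # miss, fetched from the source
second = get_city(cache, "barcelona")  # hit, served from the cache
print(first, second, source_calls)
```

With a real client, a Redis instance would take the place of `FakeCache`, and values would usually be stored with an expiry so stale entries drop out.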

In [263]:
from redis import StrictRedis

redis_config = {
    'host': '35.198.72.72',
    'port': 3389
}

redis = StrictRedis(socket_connect_timeout=3, **redis_config)

redis.set('Exponea power', 'True')  # redis-py 3.0+ raises DataError for bool values, so store a string
redis.get('Exponea power')

b'True'

In [289]:
redis.get('city_id_barcelona_all_stops')

b'90595'

In [287]:
redis.keys('*madrid*')

[b'journey_id_madrid-all-stops',
 b'madrid_all_stops',
 b'city_id_madrid',
 b'city_id_aeropuerto-madrid-barajas-t4',
 b'city_id_madrid_all_stops',
 b'aero_madrid_barajas_t2_llegada']

In [279]:
redis.get('journey_94600_94600_2018-10-20').decode('utf-8')

'{"dep": "2018-10-20 01:00:00", "arr": "2018-10-20 08:35:00", "dst": "Madrid - Barajas Airport T4", "src": "Barcelona Estaci\\u00f3n Nord", "type": "bus", "price": 32.71, "dst_id": 5555, "src_id": 595}'

### API
* lets technologies communicate with each other
* uses HTTP methods such as GET and POST
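
A small sketch of the difference between the two methods, built with the standard library and without sending anything. The endpoint `api.example.com` and the parameter names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com/journeys"  # hypothetical endpoint
payload = {"src": "Madrid", "dst": "Barcelona"}

# GET: parameters travel in the query string, there is no body
get_request = Request(BASE_URL + "?" + urlencode(payload), method="GET")

# POST: data travels in the request body
body = json.dumps(payload).encode("utf-8")
post_request = Request(
    BASE_URL,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Passing either object to `urllib.request.urlopen` would actually send the request.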

In [278]:
'34/34/34'.replace('/', '-')

'34-34-34'

### Delivering great app
* automate as much as I can
* Datadog - collects pings from application, creates metrics, dashboards, graphs, alarms. Works with python, it's free
* Sentry - collects errors, integrated with Flask, it's free
* Elasticsearch - not a relational database, extremely powerful
* black - code formatter, deals with quotes, line breaks, brackets, ...
* coala - 