### Background
This is the second part of a study of Python packages. The packages covered in this notebook are the simpler ones:
time, datetime, pickle, json, bs4, sqlite3, functools, fractions, collections, itertools and sympy.

**1. time - Time access and conversions**

Note that this module is different from the time class in the datetime module, although the two are related.

The *epoch* is the point where time starts: on January 1st of the epoch year, at 0 hours UTC, the “time since the epoch” is zero. On Unix, the epoch is January 1, 1970. To find out what the epoch is on a given platform, look at time.gmtime(0).
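As a quick check of the paragraph above, the following sketch prints the epoch on the current platform. It assumes a Unix-like system, where the result is expected to be 1970; the variable name `epoch` is only for illustration.

```python
import time

# gmtime(0) converts "0 seconds since the epoch" into a UTC struct_time,
# so it shows the epoch itself.
epoch = time.gmtime(0)

# Format it for display; on Unix-like platforms this is expected to be
# 1970-01-01 00:00:00.
print(time.strftime("%Y-%m-%d %H:%M:%S", epoch))
```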

UTC: Coordinated Universal Time. The abbreviation is not a mistake; it is a compromise between the English and French word orders.

DST: Daylight saving time



In [1]:
import time

In [3]:
time.altzone

14400

In [4]:
14400/3600  # altzone is in seconds west of UTC: the local DST zone (EDT) is 4 hours behind UTC

4.0

In [11]:
import datetime
time.gmtime(int(time.time()))

time.struct_time(tm_year=2017, tm_mon=3, tm_mday=12, tm_hour=23, tm_min=33, tm_sec=8, tm_wday=6, tm_yday=71, tm_isdst=0)

In [10]:
time.time()

1489361582.2049851

In [12]:
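time.clock()  # deprecated since Python 3.3 and removed in 3.8; prefer time.perf_counter() or time.process_time()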

0.779985

In [13]:
time.clock_getres()

AttributeError: module 'time' has no attribute 'clock_getres'

In [16]:
time.CLOCK_HIGHRES

AttributeError: module 'time' has no attribute 'CLOCK_HIGHRES'

In [17]:
time.CLOCK_MONOTONIC

AttributeError: module 'time' has no attribute 'CLOCK_MONOTONIC'

In [18]:
time.CLOCK_MONOTONIC_RAW

AttributeError: module 'time' has no attribute 'CLOCK_MONOTONIC_RAW'

In [19]:
import time

In [21]:
dir(time)

['_STRUCT_TM_ITEMS',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'altzone',
 'asctime',
 'clock',
 'ctime',
 'daylight',
 'get_clock_info',
 'gmtime',
 'localtime',
 'mktime',
 'monotonic',
 'perf_counter',
 'process_time',
 'sleep',
 'strftime',
 'strptime',
 'struct_time',
 'time',
 'timezone',
 'tzname',
 'tzset']

In [23]:
time.monotonic()

17450.498356746

In [24]:
time.perf_counter()

17478.341076603
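The absolute values returned by time.monotonic() and time.perf_counter() have no meaning on their own, since the reference point is undefined; only the difference between two calls is useful. A minimal sketch of timing a piece of code this way (the workload and the names `start`, `total` and `elapsed` are made up for illustration):

```python
import time

start = time.perf_counter()                  # arbitrary reference point
total = sum(i * i for i in range(100000))    # hypothetical workload to time
elapsed = time.perf_counter() - start        # only the difference is meaningful

print("sum of squares: {}".format(total))
print("elapsed: {:.6f} seconds".format(elapsed))
```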

In [25]:
# Suspend execution of the calling thread for the given number of seconds.
time.sleep(10)  # sleep for 10 seconds; the argument may also be a float

In [26]:
time.time()  # Return the time in seconds since the epoch as a floating point number.

1489374844.0703502

In [27]:
time.timezone

18000

In [28]:
time.tzname

('EST', 'EDT')
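time.timezone and time.altzone are both offsets in seconds west of UTC, for the non-DST and DST zones respectively. A small sketch that turns them into signed UTC offsets in hours; the helper `utc_offset_hours` is a hypothetical name, and time.altzone is only consulted when time.daylight says a DST zone is defined.

```python
import time

def utc_offset_hours(seconds_west):
    """Convert 'seconds west of UTC' into a signed UTC offset in hours."""
    return -seconds_west / 3600

standard = utc_offset_hours(time.timezone)
print("{}: UTC{:+.1f}".format(time.tzname[0], standard))

# time.daylight is nonzero only if a DST zone is defined.
if time.daylight:
    dst = utc_offset_hours(time.altzone)
    print("{}: UTC{:+.1f}".format(time.tzname[1], dst))
```

With the values shown in this notebook (18000 and 14400), the offsets come out as -5.0 and -4.0 hours.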

In [29]:
time.tzset('UTC')  # wrong: tzset() takes no arguments; it re-reads the TZ environment variable

TypeError: tzset() takes no arguments (1 given)
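The error above arises because tzset() takes no arguments: it re-reads the TZ environment variable. A hedged sketch of the intended call, assuming a Unix system (tzset() is not available on Windows) and using the POSIX-style zone string 'UTC0' as an example value; the helper name `use_timezone` is hypothetical.

```python
import os
import time

def use_timezone(tz_string):
    """Set the process time zone from a TZ string.

    Returns time.tzname, or None where tzset() is unavailable (e.g. Windows).
    """
    if not hasattr(time, "tzset"):
        return None
    os.environ["TZ"] = tz_string   # tzset() reads the zone from this variable
    time.tzset()                   # note: no arguments
    return time.tzname

# 'UTC0' means: a zone named UTC with an offset of 0 hours.
print(use_timezone("UTC0"))
print(time.timezone)
```

Note that this changes the time zone for the whole process, so later calls such as time.localtime() are affected as well.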

In [31]:
print(time.CLOCK_REALTIME)  # Availability: UNIX

AttributeError: module 'time' has no attribute 'CLOCK_REALTIME'

This concludes the session on the *time* module. There is not much of interest here; the most important function for now is time.time(), which returns the number of seconds since the epoch. The time zone settings may be useful when working with multiple exchanges around the world. Many of the constants are only available on UNIX, which reflects the importance of UNIX as a system for servers. On my Mac, I don't get to see these constants.

*datetime* is the next topic. However, a detailed walk-through has already been given in another notebook, so I'll skip it here and refer to that notebook for details.

*pickle* is the next topic.

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling”, or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.

I learned about this module because I wanted to save intermediate data structures from a program for future use, with something like the R workspace in mind. However, the pickle module does not save the entire workspace, only the specific objects you pass to it.
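One way to approximate a workspace is to collect the objects of interest in a single dictionary and pickle that. The sketch below is only an illustration with made-up names (`prices`, `params`, `workspace`); it uses pickle.dumps and pickle.loads to keep everything in memory. As with any pickle data, only unpickle bytes that come from a source you trust.

```python
import pickle

# Hypothetical intermediate results that we want to keep for a later session.
prices = [101.5, 102.0, 99.75]
params = {"window": 20, "threshold": 0.05}

# Bundle the selected objects under explicit names, then pickle the bundle.
workspace = {"prices": prices, "params": params}
blob = pickle.dumps(workspace, protocol=pickle.HIGHEST_PROTOCOL)

# Later: restore the bundle and pull the objects back out by name.
restored = pickle.loads(blob)
print(restored["params"])
print(restored == workspace)
```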

A similar module, *marshal*, also exists, but it is less useful than pickle, so I'll skip it.

In [32]:
import pickle

In [33]:
pickle.HIGHEST_PROTOCOL

4

In [34]:
pickle.DEFAULT_PROTOCOL

3

In [35]:
pickle.dump

<function _pickle.dump>

In [36]:
pickle.load

<function _pickle.load>

In [38]:
import pickle

a = {'hello': 'world'}

with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)

with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)

print(a == b)

True


So far I have only used pickle while working on the OGVHQ engine. I created a dictionary for each eventID and saved each dictionary in its own pickle file, then loaded them individually in each simulation program.
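A rough sketch of that one-file-per-ID pattern follows. The event IDs, the dictionary contents, the file naming and the helper `load_event` are all hypothetical, since the real data is not shown in this notebook; a temporary directory stands in for the actual storage location.

```python
import os
import pickle
import tempfile

# Hypothetical per-event dictionaries, keyed by event ID.
events = {
    "event_001": {"label": "first", "values": [1, 2, 3]},
    "event_002": {"label": "second", "values": [4, 5, 6]},
}

out_dir = tempfile.mkdtemp()   # scratch directory for the example

# Save one pickle file per event ID.
for event_id, data in events.items():
    path = os.path.join(out_dir, event_id + ".pickle")
    with open(path, "wb") as handle:
        pickle.dump(data, handle, protocol=pickle.HIGHEST_PROTOCOL)

def load_event(event_id):
    """Load the dictionary for a single event ID."""
    path = os.path.join(out_dir, event_id + ".pickle")
    with open(path, "rb") as handle:
        return pickle.load(handle)

print(load_event("event_002"))
```

Keeping one file per ID means a program only has to read the events it actually needs.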

The next one is *json*. This is an important one, since JSON has become a very widely used text format.

**JSON** - JSON encoder and decoder

JSON: JavaScript Object Notation

In [39]:
import json
json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])

'["foo", {"bar": ["baz", null, 1.0, 2]}]'

In [40]:
print(json.dumps("\"foo\bar"))

"\"foo\bar"


In [41]:
print(json.dumps('\u1234'))

"\u1234"


In [42]:
print(json.dumps('\\'))

"\\"


In [43]:
print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))

{"a": 0, "b": 0, "c": 0}


In [44]:
from io import StringIO
io = StringIO()
json.dump(['streaming API'], io)
io.getvalue()

'["streaming API"]'

In [45]:
json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',', ':'))

'[1,2,3,{"4":5,"6":7}]'

In [46]:
print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))

{
    "4": 5,
    "6": 7
}


It shares the same basic interface as pickle: dump, dumps, load and loads.
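The examples above only use dump and dumps; the decoding side works the same way. A short sketch of a round trip, which also shows that JSON has no tuple type, so a tuple comes back as a list (the variable names are illustrative):

```python
import json
from io import StringIO

original = ["foo", {"bar": ("baz", None, 1.0, 2)}]

# dumps/loads work on strings.
text = json.dumps(original)
decoded = json.loads(text)
print(decoded)

# The tuple was encoded as a JSON array, so it decodes to a list.
print(type(decoded[1]["bar"]))

# dump/load work on file-like objects.
buffer = StringIO()
json.dump(original, buffer)
buffer.seek(0)
print(json.load(buffer) == decoded)
```

Unlike pickle, the result is plain text that other languages can read, at the cost of supporting only a small set of basic types.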

The next is **bs4**: Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.

Beautiful Soup transforms a complex HTML document into a complex tree of Python objects. But you’ll only ever have to deal with about four kinds of objects: Tag, NavigableString, BeautifulSoup, and Comment.

In [47]:
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

In [49]:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

print(soup.prettify())

<html>
 <head>
  <title>
   The Dormouse's story
  </title>
 </head>
 <body>
  <p class="title">
   <b>
    The Dormouse's story
   </b>
  </p>
  <p class="story">
   Once upon a time there were three little sisters; and their names were
   <a class="sister" href="http://example.com/elsie" id="link1">
    Elsie
   </a>
   ,
   <a class="sister" href="http://example.com/lacie" id="link2">
    Lacie
   </a>
   and
   <a class="sister" href="http://example.com/tillie" id="link3">
    Tillie
   </a>
   ;
and they lived at the bottom of a well.
  </p>
  <p class="story">
   ...
  </p>
 </body>
</html>


In [50]:
soup.title

<title>The Dormouse's story</title>

In [51]:
soup.head

<head><title>The Dormouse's story</title></head>

In [52]:
soup.body

<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
</body>

In [53]:
soup.title.name

'title'

In [54]:
soup.title.string

"The Dormouse's story"

In [55]:
soup.title.parent.name


'head'

In [56]:
soup.p

<p class="title"><b>The Dormouse's story</b></p>

In [57]:
soup.p['class']

['title']

In [58]:
soup.a

<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>

In [59]:
soup.find_all('a')

[<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
 <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
 <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

In [60]:
soup.find(id="link3")

<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>

In [61]:
for link in soup.find_all('a'):
    print(link.get('href'))

http://example.com/elsie
http://example.com/lacie
http://example.com/tillie


In [62]:
for link in soup.find_all('a'):
    print(link.get('id'))

link1
link2
link3


The next is **sqlite3**: DB-API 2.0 interface for SQLite databases. Details can be found [here](https://docs.python.org/3.5/library/sqlite3.html). More to come later.
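Until then, here is a minimal sketch of the DB-API 2.0 pattern (connect, cursor, execute, fetch) using an in-memory database. The table and column names are made up for illustration.

```python
import sqlite3

# ":memory:" creates a temporary database that lives only in RAM.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table for the example.
cur.execute("CREATE TABLE sisters (name TEXT, link_id TEXT)")

# Use "?" placeholders instead of string formatting to pass values safely.
rows_in = [("Elsie", "link1"), ("Lacie", "link2"), ("Tillie", "link3")]
cur.executemany("INSERT INTO sisters VALUES (?, ?)", rows_in)
conn.commit()

cur.execute("SELECT name, link_id FROM sisters ORDER BY name")
rows_out = cur.fetchall()   # list of tuples, one per row
print(rows_out)

conn.close()
```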

The next is **functools**: 