# Dive into Python. Part III




**Agenda:**

    * modules
    * packages
    * envs
    * peps
    * pip
    * logging

Python is more than the language itself: much of its power comes from the community and the tools around it.

# Module

Modules in Python are simply Python files with a `.py` or `.pyc` extension. The name of the module is the name of the file without the extension. A Python module can define and implement a set of functions, classes, or variables. In the example below, we create a file named `module.py` and then import it:

In [1]:
# Go to creating a module.py
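
The notebook does not show the contents of `module.py`. A minimal sketch that would be consistent with the calls and outputs in the following cells is given here; the function and class bodies are assumptions reconstructed from how the module is used, not the original file.

```python
# module.py (hypothetical contents, reconstructed from how the notebook uses it)


def world():
    """Print a greeting, as called via module.world()."""
    print("Hello, World!")


class stats:
    """Small helper exposing the two methods the notebook calls."""

    def mean(self, values):
        return sum(values) / len(values)

    def median(self, values):
        ordered = sorted(values)
        mid = len(ordered) // 2
        if len(ordered) % 2:  # odd length: single middle element
            return ordered[mid]
        # even length: average of the two middle elements
        return (ordered[mid - 1] + ordered[mid]) / 2


if __name__ == "__main__":
    # Runs only for `python3 module.py`, not on `import module`.
    print("Script executed!")
```

The `if __name__ == "__main__":` guard is what lets the same file act both as an importable module and as a script.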

In [2]:
# Import module
import module


# Call function
module.world()

Hello, World!


We could instead import the function with `from module import world` and call it directly as `world()`.

In [3]:
# Import module
from module import world


# Call function
world()

Hello, World!


In [4]:
!python3 module.py

Script executed!


In [10]:
# Import module
from module import stats
import random


arr = [random.random() for e in range(10)]
print(arr)
s = stats()
print(s.mean(arr))
print(s.median(arr))

[0.21778115547154786, 0.688484208011857, 0.3930914003773508, 0.003038595668745092, 0.4777858485320512, 0.5105700324443067, 0.4976274529049619, 0.4955520383293158, 0.7873700120352265, 0.12825096285744086]
0.41995517066328036
0.4866689434306835


The first time a module is imported into a running Python program, it is initialized by executing the module's code once. If other code imports the same module again, it is not executed a second time, so module-level variables behave like a "singleton": they are initialized only once.

This is useful to know, because this means that you can rely on this behavior for initializing objects.
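
The caching behavior described above can be observed with a small experiment. This is only a sketch: the module name `counter_demo` and its `registry` variable are made up for illustration, and the module is written to a temporary directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name: counter_demo) into a temp directory.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "counter_demo.py").write_text(
    "print('initializing counter_demo')\n"
    "registry = []\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()  # make sure the import system sees the new file

first = importlib.import_module("counter_demo")   # module body runs here
first.registry.append("set by first import")

second = importlib.import_module("counter_demo")  # served from sys.modules

print(first is second)   # both names refer to the same module object
print(second.registry)   # state written through `first` is visible via `second`
```

The initialization message is printed once, because the second import finds the module in `sys.modules` and returns the existing object.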

# Exploring built-in modules

https://docs.python.org/3/library/

In [16]:
import sys
print('version {}.{}.{}'.format(*sys.version_info))  # gets version of the Python language

version 3.5.2


In [12]:
import os
os.name  # <-- gets operating system name

'posix'

In [13]:
os.getenv('PATH')  # <-- gets env variable

'/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

In [14]:
os.getcwd()  # <-- gets current directory

'/home/dlab-user'

In [17]:
import random
random.randint(1, 1000)  # <-- returns a random integer in range 1-1000 (both ends inclusive)

717

In [20]:
x = list(range(10))
print(x)
random.shuffle(x)
print(x)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[7, 3, 8, 5, 1, 0, 4, 6, 9, 2]


In [25]:
from datetime import datetime
now = datetime.now()
print(now)
print(now.year, now.month, now.day)

2018-05-23 08:09:59.845811
2018 5 23


In [27]:
import math
math.log(10)

2.302585092994046

In [29]:
math.sqrt(144)

12.0

# Pip

Pip is a package management system used to install and manage software packages, such as those found in the Python Package Index (https://pypi.org/).

For Python 2 (on systems where `pip` points to a Python 2 installation):

```
pip install <package-name>
```

For Python 3:

```
pip3 install <package-name>
```

In [37]:
!sudo pip3 install seaborn

The directory '/home/dlab-user/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/dlab-user/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting seaborn
Installing collected packages: seaborn
Successfully installed seaborn-0.8.1
You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [38]:
!pip3 install seaborn==0.8.1

You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


To upgrade a package to the latest version:

```
pip3 install --upgrade <package-name>
```

To remove a package:

```
pip3 uninstall <package-name>
```

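To search PyPI for packages (PyPI has since disabled the API behind this command, so it fails on current installations; use the search box at https://pypi.org/ instead):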

```
pip3 search "<keyword>"
```

In [31]:
!pip3 search "query"

date-query (0.10.2)                                         - A program to
                                                              query dates
json-query (0.0.2)                                          - JSON Query tools
juju-query (0.0.1)                                          - Juju query
                                                              charmstore
nameko-query (0.0.2)                                        - Query extension
                                                              for nameko.
graphite-query (0.11.3)                                     - Utilities for
                                                              querying
                                                              graphite's
                                                              database
version-query (1.0.1)                                       - Package version
                                                              query toolkit

You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


To get info about an installed package, including its location and files:

```
pip3 show <package-name>
```

In [33]:
!pip3 show pandas

Name: pandas
Version: 0.22.0
Summary: Powerful data structures for data analysis, time series,and statistics
Home-page: http://pandas.pydata.org
Author: The PyData Development Team
Author-email: pydata@googlegroups.com
License: BSD
Location: /usr/local/lib/python3.5/dist-packages
Requires: numpy, python-dateutil, pytz
You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
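
Part of what `pip3 show` reports can also be read from inside Python. The following is a minimal sketch using the standard-library `importlib.metadata` module (available from Python 3.8, so newer than the interpreter used in this notebook); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The result depends on what is installed in the current environment.
print(installed_version("pandas"))
print(installed_version("surely-not-a-real-package-name-12345"))
```

Returning `None` instead of raising keeps the helper convenient for quick "is this installed?" checks.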


## Virtualenv

Sometimes you want to create an isolated environment for a particular project. `virtualenv` is a tool for creating such isolated Python environments: it creates a folder that contains all the executables needed to use the packages that a Python project depends on.

Install virtualenv via pip:

```
pip3 install virtualenv
```

Then

```
cd my_project_folder
virtualenv my_project
```

`virtualenv my_project` creates a folder named `my_project` in the directory where you ran the command. The folder contains a copy of the Python executable and a copy of the pip library, which you can use to install other packages. The name of the virtual environment (here, `my_project`) can be anything.

You can also use the Python interpreter of your choice (like python2.7):

```
virtualenv -p /usr/bin/python2.7 my_project
```

To begin using the virtual environment, it needs to be activated:

```
source my_project/bin/activate
```

Install packages as usual, for example:

```
pip3 install requests
```

If you are done working in the virtual environment for the moment, you can deactivate it:

```
deactivate
```

This puts you back to the system’s default Python interpreter with all its installed libraries.

To delete a virtual environment, just delete its folder. (In this case, it would be `rm -rf my_project`.)
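
The create and delete steps above can also be scripted from Python. Note that this sketch swaps in the standard-library `venv` module rather than `virtualenv` itself, and it uses a throwaway temporary directory in place of a real project folder; both choices are assumptions made for illustration.

```python
import shutil
import tempfile
import venv
from pathlib import Path

# Hypothetical location: a temporary directory instead of a real project folder.
project_dir = Path(tempfile.mkdtemp())
env_dir = project_dir / "my_project"

# with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip too.
venv.create(env_dir, with_pip=False)

# Environments created by venv have a pyvenv.cfg marker file at their root.
has_marker = (env_dir / "pyvenv.cfg").is_file()
print(has_marker)

# Deleting the environment is just deleting its folder.
shutil.rmtree(env_dir)
removed = not env_dir.exists()
print(removed)
```

Activation is a shell-level step (`source .../bin/activate`), so it is not reproduced here.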

In order to keep your environment consistent, it’s a good idea to “freeze” the current state of the environment packages. To do this, run

```
pip3 freeze > requirements.txt
```

This will create a `requirements.txt` file, which contains a simple list of all the packages in the current environment and their respective versions. You can see the list of installed packages without the requirements format using `pip list`. Later it will be easier for a different developer (or you, if you need to re-create the environment) to install the same packages using the same versions:

```
pip3 install -r requirements.txt
```
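
The `name==version` format written by `pip3 freeze` is easy to inspect programmatically. A minimal sketch follows; the helper name `parse_pinned` is hypothetical, and it only handles exact pins, not the other specifiers that pip accepts in a requirements file.

```python
def parse_pinned(text):
    """Map package name -> version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        if sep:  # ignore lines that are not exact pins
            pins[name] = version
    return pins


# A few lines in the same format as the freeze output in this notebook.
sample = """\
absl-py==0.1.12
asn1crypto==0.24.0
astor==0.6.2
"""

pins = parse_pinned(sample)
print(pins["astor"])  # 0.6.2
```

A dictionary like this makes it simple to compare two environments and spot version differences.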

In [1]:
!pip3 freeze

absl-py==0.1.12
asn1crypto==0.24.0
astor==0.6.2
azure-common==1.1.8
azure-nspkg==2.0.0
azure-storage-blob==1.1.0
azure-storage-common==1.1.0
azure-storage-nspkg==3.0.0
bcrypt==3.1.4
bleach==1.5.0
blinker==1.3
boto==2.48.0
boto3==1.6.17
botocore==1.9.17
BRA==1.3
bz2file==0.98
cairocffi==0.7.2
certifi==2018.1.18
cffi==1.11.5
chardet==3.0.4
click==6.7
cloud-init==17.1
cloudpickle==0.5.2
command-not-found==0.3
configobj==5.0.6
cryptography==2.2.1
cycler==0.10.0
Cython==0.28.1
dask==0.17.1
decorator==4.2.1
distributed==1.21.4
Django==2.0.5
docutils==0.14
entrypoints==0.2.3
Fabric==1.14.0
fabric-virtualenv==0.3.0
fabvenv==0.2.1
fastparquet==0.1.4
first==2.0.1
funcsigs==0.4
gast==0.2.0
gensim==3.4.0
grpcio==1.10.0
h5py==2.7.1
HeapDict==1.0.0
html5lib==0.9999999
idna==2.6
ipykernel==4.8.2
ipython==6.2.1
ipython-genutils==0.2.0
ipywidgets==7.2.1
jedi==0.11.1
Jinja2==2.8
jmespath==0.9.3
joblib==0.11
jsonpatch==1.10
jsonpointer==1.9
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==4.4.0
jupyter-c

## virtualenvwrapper

`virtualenvwrapper` provides a set of commands that make working with virtual environments much more pleasant. It also keeps all your virtual environments in one place.

To install (make sure virtualenv is already installed):

```
pip3 install virtualenvwrapper
export WORKON_HOME=~/Envs
source /usr/local/bin/virtualenvwrapper.sh
```

To create a virtual environment:

```
mkvirtualenv my_project
```

This creates the my_project folder inside ~/Envs.

Work on a virtual environment:

```
workon my_project
```

`virtualenvwrapper` provides tab-completion on environment names. It really helps when you have a lot of environments and have trouble remembering their names.

`workon` also deactivates whatever environment you are currently in, so you can quickly switch between environments.

Deactivating is still the same:

```
deactivate
```

To delete:

```
rmvirtualenv my_project
```

More: https://virtualenvwrapper.readthedocs.io/en/latest/index.html

# Logging

In [2]:
import logging

In [24]:
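# The empty name returns the root logger
logger = logging.getLogger('')
# Drop handlers left over from earlier runs of this cell to avoid duplicate output
if logger.hasHandlers():
    logger.handlers.clear()
handler = logging.StreamHandler()  # writes to stderr by default
formatter = logging.Formatter('%(levelname)-8s [%(asctime)s]  %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # messages below INFO (such as DEBUG) are dropped
logger.info('Some useful info')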

INFO     [2018-05-24 11:20:37,427]  Some useful info
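
The effect of `setLevel` in the cell above can be seen with a small sketch that captures log output in memory. The logger name `demo.levels` and the format string are assumptions chosen for illustration.

```python
import io
import logging

# Hypothetical logger name; any dotted name works.
log = logging.getLogger("demo.levels")
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo output away from the root logger's handlers

# Send records to an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
log.addHandler(handler)

log.debug("hidden: below the INFO threshold")
log.info("shown")
log.warning("shown too")

captured = buffer.getvalue().splitlines()
print(captured)  # ['INFO:demo.levels:shown', 'WARNING:demo.levels:shown too']
```

Using a named logger instead of the root logger lets each part of a program set its own level independently.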


# tqdm

In [27]:
from tqdm import tqdm
for i in tqdm(range(int(9e6))):
    pass

100%|██████████| 9000000/9000000 [00:02<00:00, 3455166.68it/s]


In [29]:
with tqdm(total=100) as pbar:
    for i in range(10):
        pbar.update(10)

100%|██████████| 100/100 [00:00<00:00, 98411.64it/s]
