# Object-Oriented Programming - Part 2

This is the second Jupyter Notebook on OOP. The previous notebook introduced the main building blocks of OOP.

From here on, the focus is on organising code into modules, i.e. modularized code. The main parts are therefore written in separate `.py` scripts.

This notebook serves as a guide and as a place for notes on the topic.

## Organizing into Modules

- In Python, a module is a single Python file that contains a collection of functions, classes, and/or global variables.
- A package is essentially a collection of modules placed into a directory.

- [virtual machines in the cloud](https://aws.amazon.com/de/getting-started/tutorials/launch-a-virtual-machine/)
- The Distribution and Gaussian code was refactored into individual modules. 
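
To make the idea of a module concrete, here is a minimal sketch that writes a single `.py` file into a temporary folder and imports it. The module name `shapes_demo` and its contents are made up for illustration and are not part of the course code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents, used only for illustration.
MODULE_SOURCE = '''
PI_APPROX = 3.14159

def circle_area(radius):
    """Return the approximate area of a circle."""
    return PI_APPROX * radius ** 2
'''

with tempfile.TemporaryDirectory() as tmp:
    # A module is just one .py file on disk.
    Path(tmp, "shapes_demo.py").write_text(MODULE_SOURCE)

    sys.path.insert(0, tmp)          # make the temporary folder importable
    importlib.invalidate_caches()
    try:
        shapes_demo = importlib.import_module("shapes_demo")
        area = shapes_demo.circle_area(2)
        module_file = Path(shapes_demo.__file__).name
    finally:
        sys.path.remove(tmp)

print(module_file)  # name of the file the module was loaded from
print(area)         # result of calling a function defined in the module
```

The same mechanism applies to the refactored Distribution and Gaussian files: each `.py` file becomes importable under its file name.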

**Next:**
Convert the Distributions code into a Python package. A Python package also needs an `__init__.py` file.

> Create this `__init__.py` file and then pip install the package into your local Python installation.

#### How to setup files so they can be pip installed?

Folder structure:

```
├── distributions
│    └── Gaussiandistribution.py
│    └── Generaldistribution.py
│    └── __init__.py
│     
└── setup.py
```

- Create a folder for your package, organised as in the folder structure shown above.
- Create a subfolder containing all the modules you want in your package, i.e. `distributions`.
 - The folder needs to contain an `__init__.py` file. A package always needs an init file.
- Create a `setup.py` file. This file is necessary for pip installing.

Difference in `Gaussiandistribution.py` to previous files we have used: 

```python
from .Generaldistribution import Distribution
```

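- Note the leading dot: it makes this a relative import, so `Generaldistribution` is looked up inside the same package as `Gaussiandistribution.py`.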
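
The effect of the leading dot can be tried out without touching the real package. The sketch below builds a throwaway package in a temporary folder; the names `relimport_demo`, `base.py` and `gaussian.py` are hypothetical stand-ins for the course files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "relimport_demo")      # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")   # marks the folder as a package
    (pkg / "base.py").write_text("class Distribution:\n    kind = 'generic'\n")
    # The leading dot means "look in the same package as this file".
    (pkg / "gaussian.py").write_text(
        "from .base import Distribution\n\n"
        "class Gaussian(Distribution):\n"
        "    kind = 'gaussian'\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        gaussian = importlib.import_module("relimport_demo.gaussian")
        parent_class = gaussian.Gaussian.__mro__[1]
        parent_name = parent_class.__name__
        parent_module = parent_class.__module__
    finally:
        sys.path.remove(tmp)

# The parent class was resolved from the sibling module inside the package.
print(parent_name, parent_module)
```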

#### `__init__.py` 

- `__init__.py`: Always required, even if empty. This code gets run whenever you import the package inside a Python program.

This specific file only includes the following code: 

```python
from .Gaussiandistribution import Gaussian
```

This line makes it possible to import the Gaussian class directly from the package, without naming the module it lives in.
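
As a rough illustration of what `__init__.py` does, the following sketch creates a temporary package whose init file re-exports a class and sets a flag. The package name `init_demo` and the flag `INIT_WAS_RUN` are assumptions made for this example only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "init_demo")           # hypothetical package name
    pkg.mkdir()
    (pkg / "Gaussiandistribution.py").write_text("class Gaussian:\n    pass\n")
    # __init__.py runs on import; here it re-exports Gaussian and sets a flag.
    (pkg / "__init__.py").write_text(
        "from .Gaussiandistribution import Gaussian\n"
        "INIT_WAS_RUN = True\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        init_demo = importlib.import_module("init_demo")
        init_ran = init_demo.INIT_WAS_RUN
        # Gaussian is reachable on the package itself, not only on the module.
        has_gaussian = hasattr(init_demo, "Gaussian")
    finally:
        sys.path.remove(tmp)

print(init_ran, has_gaussian)
```

Without the import line in `__init__.py`, the class would only be reachable via the longer `init_demo.Gaussiandistribution.Gaussian` path.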

#### `setup.py`:

- includes metadata about the package, like name, version, description, etc. 

```python 
from setuptools import setup

# The entry in `packages` must match the folder name, i.e. "distributions".
setup(name="distributions",
      version="0.1",
      description="Gaussian distribution",
      packages=["distributions"],
      zip_safe=False)
```

#### How to install this package?

- Go into a terminal and make sure you're in the directory with the `setup.py` file.
- type: `pip install .`
 - the dot tells pip to look for the setup file in the current folder. 
 
After that the package is installed and can be imported via 

`from distributions import Gaussian`

#### Where did pip install the package to?

Wherever pip installs packages on your system. 

After importing the package, use `distributions.__file__` to check where it has been installed.
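
The same check works for any importable package. The sketch below uses the standard library's `json` package as a stand-in, since `distributions` may not be installed where this is run.

```python
import importlib.util
import json  # stand-in for any installed package, e.g. distributions
from pathlib import Path

# __file__ points at the package's __init__.py on disk.
print(json.__file__)

# The folder that contains the package's modules.
package_dir = Path(json.__file__).parent
print(package_dir)

# find_spec locates a package by name without needing a prior import.
spec = importlib.util.find_spec("json")
print(spec.origin)
```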

If you decide to install this package on your local computer, you might want to create a virtual environment first.

A virtual environment is like a silo for installing Python packages.

#### What is pip?

- a [python package manager](https://pip.pypa.io/en/stable/)
- When you execute a command like `pip install numpy`, pip will download the package from a Python package repository called [PyPi](https://pypi.org/).
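
To see what pip has installed into the current Python installation, the standard library's `importlib.metadata` can be queried. This is a small sketch; the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The result depends on what is installed in the active environment.
for name in ("numpy", "distributions"):
    print(name, installed_version(name))
```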

If you want to develop a package locally on your computer, you should consider setting up a virtual environment. That way, if you install your package on your computer, it won't be installed into your main Python installation. Before the next exercise, the next part of the lesson discusses what virtual environments are and how to use them.

## Virtual Environments

If you decide to install your package on your local computer, you'll want to create a virtual environment. 

> A virtual environment is a silo-ed Python installation apart from your main Python installation. 
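
One way to check from inside Python whether a venv-style environment is active is to compare `sys.prefix` with `sys.base_prefix`. This is a sketch for environments created with venv; conda environments are organised differently and are not detected this way. The function name is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when running inside an environment created by venv."""
    # In a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print(sys.prefix)
print(in_virtual_environment())
```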

Two different environment managers: 

- conda ([link](https://conda.io/docs/))
- venv

#### conda

- manages packages, i.e. `conda install numpy`
- manages environments 

With an environment manager, you can install packages on your computer without affecting your main Python installation. 

```
conda create --name environmentname
source activate environmentname
conda install numpy
```

#### Pip and Venv

- venv is an environment manager that comes pre-installed with Python 3.
- pip is a package manager. pip can only manage Python packages.
- conda is language agnostic and was invented because pip could not handle data science packages outside of Python.
 - conda manages environments AND packages.

To use venv and pip, the commands look something like this:

```
python3 -m venv environmentname
source environmentname/bin/activate
pip install numpy
```

#### Which to choose?

- ...

If you create a conda environment and then pip install the distributions package, you may find that the system installs your package [globally rather than in your local conda environment](https://github.com/ContinuumIO/anaconda-issues/issues/1429).

If you create the conda environment and install pip at the same time, pip behaves as expected and installs packages into your local environment:

`conda create --name environmentname pip`

- Using pip with venv works as expected.

Pip and venv tend to be used for generic software development projects, including web development.

- venv is what is recommended for this project.

#### How to use venv?

- The virtual environment package should already be installed.
- [installing using pip and venv](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)

- cd to where you want to create the virtual environment.
- type: `python -m venv env_name`
- a folder `env_name` appears
- activate (on Windows) using: `.\env_name\Scripts\activate`
- anything that you install now will be installed into your virtual environment instead of your default Python installation. 
- deactivate using: `deactivate`
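
The environment folder can also be created from Python with the standard library's `venv` module, which makes it easy to look at what `python -m venv env_name` produces. The sketch below works in a temporary directory and skips installing pip to stay fast, whereas the command-line tool installs pip by default.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp, "env_name")
    # with_pip=False is a shortcut for this sketch only.
    venv.create(env_dir, with_pip=False)

    # The environment is an ordinary folder with its own configuration file.
    created = sorted(p.name for p in env_dir.iterdir())
    has_config = (env_dir / "pyvenv.cfg").exists()

print(created)     # folder contents differ between Windows and Unix-like systems
print(has_config)
```

Because the environment is just a folder, deleting that folder removes the environment.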

For now, keep venv activated: 

- cd to where the package is:
- `cd python_package_example`
- `pip install .`

You can always just delete the virtual environment folder from your computer and there shouldn't be any real consequences.

Start using the package 

```python
from distributions import Gaussian

gaussian_one = Gaussian(25, 2)
gaussian_two = Gaussian(30, 4)
gaussian_one.mean
gaussian_one + gaussian_two
```
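
The actual `Gaussian` class lives in the package and is not shown in this notebook. As a rough stand-in, the sketch below assumes that the constructor takes a mean and a standard deviation and that `+` combines two independent Gaussians. The attribute names are assumptions.

```python
import math


class Gaussian:
    """Minimal stand-in for the package's Gaussian class (assumed interface)."""

    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma

    def __add__(self, other):
        # For independent Gaussians the means add and the variances add.
        combined_stdev = math.sqrt(self.stdev ** 2 + other.stdev ** 2)
        return Gaussian(self.mean + other.mean, combined_stdev)

    def __repr__(self):
        return f"mean {self.mean}, standard deviation {self.stdev}"


gaussian_one = Gaussian(25, 2)
gaussian_two = Gaussian(30, 4)
total = gaussian_one + gaussian_two
print(total)
```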

## Binomial Class

- Adding Binomial Class to your distributions!

Create a new package in the folder `binomial package`

- Change `__init__.py` to account for the binomial class. 
- Work through `Binomialdistribution.py`, which guides you through building the binomial class.
- `test.py` contains unit tests that help you assess whether your code is working properly.
- After changing the package, it needs to be reinstalled (make sure you are in the package directory):
- Upgrade: `pip install --upgrade .`

Note that there is also a `Binomialdistribution_challenge.py` file which is a little bit more demanding as there is no code given at all.

> Task: install the distributions package in a virtual environment and use it in the command line

```
from distributions import Gaussian, Binomial
Gaussian(10, 7)
Binomial(.4, 25)
```

If everything works this should output the values for both distributions.
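
For orientation, here is a minimal sketch of what a binomial class could look like. It assumes the constructor takes the success probability first and the number of trials second, as in `Binomial(.4, 25)`. The attribute and method names are assumptions and may differ from the exercise solution.

```python
import math


class Binomial:
    """Minimal binomial distribution sketch (assumed interface)."""

    def __init__(self, prob=0.5, size=20):
        self.p = prob
        self.n = size
        self.mean = self.n * self.p
        self.stdev = math.sqrt(self.n * self.p * (1 - self.p))

    def pdf(self, k):
        """Probability of exactly k successes in n trials."""
        return math.comb(self.n, k) * self.p ** k * (1 - self.p) ** (self.n - k)

    def __repr__(self):
        return (f"mean {self.mean}, standard deviation {self.stdev}, "
                f"p {self.p}, n {self.n}")


binomial = Binomial(.4, 25)
print(binomial)
total_probability = sum(binomial.pdf(k) for k in range(binomial.n + 1))
```

The `__repr__` method is what makes the object print its values when it is entered at the Python prompt.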

## Scikit-learn Source Code

- scikit-learn as example of OOP
- [source code](https://github.com/scikit-learn/scikit-learn)

#### Contributing to a GitHub Project

- [Beginner's Guide to Contributing to a Github Project](https://akrabat.com/the-beginners-guide-to-contributing-to-a-github-project/)
- [Contributing to a Github Project](https://github.com/MarcDiethelm/contributing/blob/master/README.md)

#### Advanced Python OOP Topics

- [Decorators](https://realpython.com/primer-on-python-decorators/)
- [Mixins](https://easyaspython.com/mixins-for-fun-and-profit-cb9962760556)


## Putting Code on PyPi


- Follow this video [Putting Code on PyPi](https://www.youtube.com/watch?time_continue=2&v=4uosDOKn5LI)

There are two different websites:

- [pypi.org](https://pypi.org/)
- [test.pypi.org](https://test.pypi.org/)

Test first on the test website.

File structure for uploading the package (folder `binomial_package_files`):

```
├── dsnd_probability
│    └── Gaussiandistribution.py
│    └── Generaldistribution.py
│    └── Binomialdistribution.py
│    └── README.md
│    └── license.txt
│    └── __init__.py
│    └── setup.cfg
│
└── setup.py
```

- `license.txt`: Copyright information; the text of the MIT license was used (copy/pasted)
- `README.md`: Documents how the package works
- `setup.cfg`: Metadata about the readme file
- Name of the package is "dsnd_probability"
 - same name as for the folder (necessary or convention?)
 - Every package on PyPi needs a unique name. 
 - packages variable: use the same name that you put in the name variable.
 - `zip_safe=False` means the package can't be run directly from a zip file.
 
 
#### Terminal Commands

- cd to `binomial_package_files`
- type: `python setup.py sdist`

*A couple of new folders appeared* 

Inside the `dist` folder is a `...tar.gz` file

- This is the file that you will upload to the PyPi repository.
- Next, install the twine package: `pip install twine`
- Upload the package to the test repo using: `twine upload --repository-url https://test.pypi.org/legacy/ dist/*`

*Check in your test PyPi account whether it was uploaded...*

- pip install the package: `pip install --index-url https://test.pypi.org/simple/ dsnd-probability`

```
# command to upload to the pypi repository
twine upload dist/*
pip install dsnd-probability
```

After uploading the package to the official PyPi website, the package can be installed directly using pip.
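
The package folder is called `dsnd_probability`, while the pip commands use `dsnd-probability`. Package indexes compare project names in a normalized form, so both spellings refer to the same project. The sketch below follows the normalization rule from PEP 503; the function name is chosen for this example.

```python
import re


def normalize_name(name):
    """Normalize a project name as described in PEP 503."""
    # Runs of "-", "_" and "." collapse into a single "-", and case is ignored.
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize_name("dsnd_probability"))
print(normalize_name("dsnd-probability") == normalize_name("dsnd_probability"))
```

The import name is not affected by this: in Python code the package is still imported under its folder name.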

## More PyPi Resources

- Tutorial on distributing Python packages, including more configuration options for setup.py file: [Tutorial](https://packaging.python.org/tutorials/distributing-packages/)

In that tutorial, the command to run `setup.py` is slightly different:
```
python3 setup.py sdist bdist_wheel
```

 - This command will still output a folder called `dist`. Difference is that you will get both a `.tar.gz` and a `.whl` file. 
  - `tar.gz` file is called a source archive
  - `.whl` file is a built distribution. This is a newer type of installation file for Python packages. 
  
When you pip install a package, pip will first look for a whl file (wheel file) and if there isn't one, then look for the `tar.gz` file. 

- A `tar.gz` file, i.e. a source archive, contains the files needed to [compile](https://en.wikipedia.org/wiki/Compiler) and install a Python package.
- A `whl` file, i.e. a built distribution, only needs to be copied to the proper place for installation. Behind the scenes, pip installing a whl file has fewer steps than a tar.gz file. 
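
The preference order described above can be sketched as a small function. This is a simplification for illustration only: real pip also checks whether a wheel is compatible with the current platform and Python version. The file names are hypothetical examples of what a `dist` folder might contain.

```python
def pick_install_file(filenames):
    """Prefer a wheel over a source archive, mirroring the order described above."""
    wheels = [name for name in filenames if name.endswith(".whl")]
    source_archives = [name for name in filenames if name.endswith(".tar.gz")]
    if wheels:
        return wheels[0]
    if source_archives:
        return source_archives[0]
    return None


# Hypothetical contents of a dist folder.
dist_files = [
    "dsnd_probability-0.1.tar.gz",
    "dsnd_probability-0.1-py3-none-any.whl",
]
print(pick_install_file(dist_files))
```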

#### Other links

- [Overview of PyPi](https://docs.python.org/3/distutils/packageindex.html)
- [MIT License](https://opensource.org/licenses/MIT)

## Exercise: Upload to PyPi

Create the following files: 

- setup.cfg
- README.md
- license.txt

Create accounts for 

- pypi test repository
- pypi repository
- Don't forget to store your passwords safely

Note that the package name needs to be unique

- Change the name of the package from `distributions` to something else
- Accordingly, change the information in `setup.py` and the folder name

## Lesson Summary: OOP

- classes vs objects
- Methods and Attributes
- Magic Methods, Inheritance
- Python Package
- PyPi

print("Done!")

Done!
