![IE](../img/ie.png)

# Session 2: The Python execution model

### Juan Luis Cano Rodríguez <jcano@faculty.ie.edu> - Master in Business Analytics and Big Data (2020-05-07)

### How does `import` work?

How do `import os` and `import pandas` work? Both modules live in directories listed in `sys.path`:

In [1]:
import os

In [2]:
import pandas
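To check where Python would load a module from, one option is to ask the import machinery for the module's spec. This is a minimal sketch: `locate` is a hypothetical helper name, not something defined in this session, and `json` simply stands in for any standard library module.

```python
import importlib.util


def locate(module_name):
    """Return where a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


print(locate("json"))    # a path inside the standard library directory
print(locate("pandas"))  # a path inside site-packages, or None if pandas is not installed
print(locate("module_that_does_not_exist"))  # None
```

The exact paths depend on how and where Python is installed on your machine.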

### How can I `import` my code?

Before answering that question, let's create a small project, called "IE-NLP-Utils", where we will place some basic functions we will use throughout the course.

1. Go to the command line
2. Browse to a directory of your liking, for example `cd ~/Projects/IE`
3. Create a new directory, for example `mkdir ie-nlp-utils`
4. Enter that directory, `cd ie-nlp-utils`
5. Create a basic `README.md` containing the name of the project and your name
6. Generate a `.gitignore` file from https://gitignore.io/ for "Python" and "Jupyter notebooks", and save it in the same directory as the `README.md`
7. `git add` the two new files, and `git commit` with the message `"First commit"`

#### Exercise

We now have a basic structure to start a Python project (we will see how to refine it in another session). Next, we are going to write some basic code that we can use.

1. Create an `ie_nlp_utils.py` file with a function called `tokenize` that takes a `str` sentence and splits it into a `list` of words
2. Open a Python interpreter (`winpty python` on Git Bash on Windows, `python` everywhere else) and check that `from ie_nlp_utils import tokenize` works
3. Test the function by calling it with some sentence
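One possible solution for the contents of `ie_nlp_utils.py` is sketched below. It assumes that splitting on whitespace is enough, so punctuation stays attached to the words; other tokenization rules are equally valid for this exercise.

```python
def tokenize(sentence):
    """Split a sentence into a list of words, using whitespace as the separator."""
    return sentence.split()


print(tokenize("The quick brown fox"))  # ['The', 'quick', 'brown', 'fox']
```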

### The `PYTHONPATH`

However, importing our code only works from the same directory:

```
$ ls
ie_nlp_utils.py README.md
$ cd ..
$ ls
ie-nlp-utils
$ python3
>>> import math  # Still works
>>> import ie_nlp_utils
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'ie_nlp_utils'
```

Why? To find what we want to import, Python looks in a list of predefined locations, available as `sys.path` (called the "PATH" here, not to be confused with the shell's `PATH` environment variable):

```
>>> import sys
>>> sys.path
['', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages']
```
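The lookup can be pictured as a loop over that list. The sketch below is a deliberately simplified model, not the real import system, which also handles packages, zip files, compiled extensions and built-in modules. `find_in_sys_path` is a hypothetical helper name.

```python
import os
import sys


def find_in_sys_path(module_name):
    """Return the sys.path directories that contain `<module_name>.py`."""
    matches = []
    for entry in sys.path:
        directory = entry or os.getcwd()  # '' means the current directory
        if os.path.isfile(os.path.join(directory, module_name + ".py")):
            matches.append(directory)
    return matches


# Empty list unless ie_nlp_utils.py sits in one of the searched directories
print(find_in_sys_path("ie_nlp_utils"))
```

Running it from inside and outside the project directory should mirror the behaviour seen above: the module is only found when its directory is one of the entries.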

Therefore, there are two ways of making our code **globally importable**:

1. Modify the "PATH"
2. Put our code inside a location predefined in the "PATH"

The first option can be achieved like this:

```
>>> sys.path.insert(0, "/home/juanlu/test_project")
>>> import model  # Works!
Hello, world!
>>>
```
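The same idea as a self-contained script that you can run anywhere. It creates a throwaway directory so that no real path is assumed; `demo_greeting` is a hypothetical module name used only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a small module.
project_dir = tempfile.mkdtemp()
Path(project_dir, "demo_greeting.py").write_text('MESSAGE = "Hello, world!"\n')

# Put the directory first so it takes priority over the other locations.
sys.path.insert(0, project_dir)
importlib.invalidate_caches()  # make sure the freshly created file is noticed

import demo_greeting

print(demo_greeting.MESSAGE)  # Hello, world!
```

The change only lasts for the current interpreter session: a new Python process starts again with the default `sys.path`.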

Or, alternatively, from outside of the interpreter:

```
$ export PYTHONPATH=/home/juanlu/Projects/IE/ie-nlp-utils
$ python3
>>> import sys
>>> sys.path  # Notice the change!
['', '/home/juanlu/Projects/IE/ie-nlp-utils', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages']
>>> import ie_nlp_utils  # Now it works!
>>>
```
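The effect of `PYTHONPATH` can also be reproduced from Python by launching a child interpreter with a modified environment. This is a sketch for experimentation only: `demo_env_module` is a hypothetical module name, and the directory is a temporary one created on the fly.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a small module.
project_dir = tempfile.mkdtemp()
Path(project_dir, "demo_env_module.py").write_text('MESSAGE = "found via PYTHONPATH"\n')

# Copy the current environment and set PYTHONPATH for the child process only.
child_env = dict(os.environ, PYTHONPATH=project_dir)

result = subprocess.run(
    [sys.executable, "-c", "import demo_env_module; print(demo_env_module.MESSAGE)"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # found via PYTHONPATH
```

Because the variable is set only for the child process, the current shell and interpreter are left untouched.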

However, **both are bad practices and should be avoided**. In future sessions we will see [the right way to distribute Python code](https://packaging.python.org/tutorials/packaging-projects/).

### What does `import` do?

Python code is normally written in `.py` scripts. For example:

```
$ cat model.py
print("Hello, world!")
```

These scripts can be imported in the same way that any module or package from the [standard library](https://docs.python.org/3/library/index.html) can:

```
$ python3
>>> import math  # Works, because it's in stdlib
>>> import numpy as np  # Works if you `pip install numpy`'ed in advance
>>> import model  # Works if you are in the same directory
Hello, world!
>>> 
```

When a script is imported, **Python runs the script**. That is how the functions and classes defined inside it become available.
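The following sketch makes this visible by capturing what gets printed during the import. `demo_model` is a hypothetical stand-in for the `model.py` script above, written to a temporary directory. `sys.path` is modified here only to keep the demonstration self-contained.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Write a small script that prints something and defines a function.
project_dir = tempfile.mkdtemp()
Path(project_dir, "demo_model.py").write_text(
    'print("Hello, world!")\n'
    "def greet():\n"
    '    return "Hi!"\n'
)
sys.path.insert(0, project_dir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import demo_model  # first import: the file is executed, so it prints
    import demo_model  # second import: the cached module is reused, nothing runs

print(repr(captured.getvalue()))  # 'Hello, world!\n'
print(demo_model.greet())  # Hi!
```

The message appears only once because Python stores imported modules in `sys.modules` and reuses them on later imports.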