# Python and its Scientific Modules
---
Questions:
- "How do I install and use Python modules?"
- "How do I interact with Python code?"
- "What are some popular scientific Python packages?"

Objectives:
- "Ensure students can access a Linux terminal and have basic requirements installed."
- "Understand how Python packages are distributed and the benefits of environment management."
- "Understand how to import Python packages or call them via the command line interface."

---

### Instructor notes (remove before merging to CAMLC24)

I think we can use the Molecular Sciences Software Institute's free courses on [Python Scripting for Computational Molecular Science](https://education.molssi.org/python_scripting_cms/01-introduction/index.html) and [Python Package Best Practices](https://education.molssi.org/python-package-best-practices/) to build a lot of this material.

**IIA**: Maybe some of the code boxes I have included could be executable boxes, so the students can actually run them and see the output instead of having the result 'spoiled'?

### Python Basics

```python
# It is best practice to import all modules at the top of your file.

# It is also best practice to be liberal with comments, so that if you come
#  back to the code, or someone else tries to read it, it is more
#  easily understood.

# To make the code easier to read, the sentence above was split across
#  several lines. How you format things like that is personal preference.

import numpy       # https://numpy.org
import matplotlib  # https://matplotlib.org
import rdkit       # http://www.rdkit.org/docs/index.html
import pandas      # https://pandas.pydata.org/docs/getting_started/index.html
```

If you receive a `ModuleNotFoundError` message, try executing the following in a new cell:
> `pip install $module`

where `$module` is the name of the module that was not found.
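To find out which modules are missing before running the imports, a check along the following lines can help. This is a minimal sketch that uses only the standard library; the list `required_modules` is an assumption based on the imports above and should be edited to match your own script.

```python
import importlib.util

# Module names imported in this lesson; edit the list to match your own script
required_modules = ['numpy', 'matplotlib', 'rdkit', 'pandas']

missing_modules = []
for module_name in required_modules:
    # find_spec returns None when a top-level module cannot be located
    if importlib.util.find_spec(module_name) is None:
        missing_modules.append(module_name)

if missing_modules:
    print('Missing modules:', ', '.join(missing_modules))
else:
    print('All required modules are installed.')
```

Note that the name used in `import` is not always the name used by `pip install`, so check the package documentation if the installation fails.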

#### Python Packages and Environment Management

There are several ways to install packages. You can install a package from source code downloaded from its website or GitHub repository, or from a distribution platform. pip is an example of a Python package manager, and it installs packages from the [Python Package Index (PyPI) distribution platform](https://pypi.org), where you can peruse the packages available for installation.

[Anaconda](https://anaconda.org) (also see [miniconda](https://docs.anaconda.com/free/miniconda/)) is also a package manager, but it is much more than that. Besides installing and updating packages, Anaconda also creates and manages virtual environments. You can think of a virtual environment as a new computer, independent of any other environment on the same machine. This becomes very important if a package you want to use requires a specific version of another package (for example, maybe it requires a Python version < 3.0), but you have other packages that require other versions of Python. You can use the `conda create` command to create a virtual environment that is tailored to the dependency requirements of specific packages or systems. If the package has been distributed on Anaconda, you will be able to find conda installation instructions for it. You can also use pip to install the package; just make sure you are using the pip assigned to your conda environment (this can be checked by running `which pip` and making sure it is located in the path of your conda environment). Lastly, Anaconda supports management of non-Python packages, so it is much broader than pip/PyPI, which only handle Python packages.

When working on shared computing resources, you are often required to use virtual environments created with conda or venv. [venv](https://docs.python.org/3/library/venv.html) is solely a virtual environment manager and can be used similarly to Anaconda; however, it only manages Python packages.
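From inside Python, you can check which interpreter and environment are currently in use. The sketch below relies only on the standard `sys` module. The comparison of `sys.prefix` with `sys.base_prefix` detects environments created with venv; it is not a reliable test for conda environments, where looking at the printed paths is the safer check.

```python
import sys

# Path of the Python interpreter that is running this code
print('Interpreter:', sys.executable)

# Root directory of the active environment
print('Environment prefix:', sys.prefix)

# Inside an environment created with venv, sys.prefix differs from sys.base_prefix
in_venv = sys.prefix != sys.base_prefix
print('Running inside a venv environment:', in_venv)
```

If the printed interpreter path does not point inside the environment you expected, the environment is probably not activated.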

#### Basic Syntax

##### Variables, print values and basic operations

We can assign different types of values to a variable, like your name (*string*), age (*integer*) or height (*float*).

```python
name = 'Pepito Jimenez'
age = 53
height = 1.73
```
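If you are unsure which type Python has assigned to a variable, the built-in `type()` function reports it. A short sketch using the same three variables:

```python
name = 'Pepito Jimenez'
age = 53
height = 1.73

# type() reports the data type that Python inferred for each value
print(type(name))
print(type(age))
print(type(height))
```
```output
<class 'str'>
<class 'int'>
<class 'float'>
```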

We can also perform mathematical operations with these variables and show the results of these operations with `print($variable)`.

```python
pizza = 12.7
white_wine = 27.8
cheesecake = 5.3

total_price = pizza + white_wine + cheesecake

print(f'Total price of dinner is: {total_price}')
```
```output
Total price of dinner is: 45.8
```
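Other arithmetic operators work the same way, and an f-string can control how many decimal places are printed. The sketch below assumes, purely for illustration, that the bill is split between two people (`number_of_people` is a hypothetical variable that is not part of the example above).

```python
pizza = 12.7
white_wine = 27.8
cheesecake = 5.3
number_of_people = 2    # Hypothetical value, used only for this illustration

total_price = pizza + white_wine + cheesecake
price_per_person = total_price / number_of_people

# ':.2f' formats the number with exactly two decimal places
print(f'Total price of dinner is: {total_price:.2f}')
print(f'Price per person is: {price_per_person:.2f}')
```
```output
Total price of dinner is: 45.80
Price per person is: 22.90
```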

There are also other types of data that you can store in variables:
- **list**
    - *Mutable array of data*
    - Defined in between `[]`
- **tuple**
    - *Immutable array of data*
    - Defined in between `()`
- **dictionary**
    - *Mutable collection of key:value pairs*
    - Defined in between `{}`
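The practical difference between mutable and immutable types can be sketched as follows. The variable names are chosen only for this illustration, and `try`/`except` is used here solely to catch the error that Python raises when a tuple is modified.

```python
names_list = ['Pepito Jimenez', 'John Smith']
names_tuple = ('Pepito Jimenez', 'John Smith')
ages = {'Pepito Jimenez': 53, 'John Smith': 27}

# Elements are accessed by position (lists, tuples) or by key (dictionaries)
print(names_list[0])
print(ages['John Smith'])

# Lists and dictionaries can be changed after they are created
names_list.append('Patricia Summers')
ages['Patricia Summers'] = 32
print(len(names_list), len(ages))

# Tuples cannot be changed: assigning to an element raises a TypeError
try:
    names_tuple[0] = 'Someone Else'
except TypeError as error:
    print('Tuples cannot be modified:', error)
```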

```python
list_example = ['Pepito Jimenez', 'John Smith', 'Patricia Summers']                 # List of names
tuple_example = ('Pepito Jimenez', 'John Smith', 'Patricia Summers')                # Tuple of names
dictionary_example = {'Pepito Jimenez':53, 'John Smith':27, 'Patricia Summers':32}  # Dictionary of names and their corresponding age
```

##### For loops

For loops are used to repeat an action *N* times, or to iterate over the array-like variables described above and perform an action for every element in the array.

A basic *for loop* would be printing all the names in the list above:

```python
for name in list_example:
    print(name)
```

```output
Pepito Jimenez
John Smith
Patricia Summers
```
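The same syntax covers the other two uses of a for loop: repeating an action a fixed number of times with the built-in `range()`, and iterating over the key:value pairs of a dictionary with `.items()`. A minimal sketch (the dictionary is redefined under the name `ages` so the block runs on its own):

```python
# Repeat an action N = 3 times; range(3) yields 0, 1, 2
for step in range(3):
    print(step)

# Iterate over the key:value pairs of a dictionary
ages = {'Pepito Jimenez': 53, 'John Smith': 27, 'Patricia Summers': 32}
for person, person_age in ages.items():
    print(f'{person} is {person_age} years old')
```
```output
0
1
2
Pepito Jimenez is 53 years old
John Smith is 27 years old
Patricia Summers is 32 years old
```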

As a more scientifically useful example, this type of loop can also be used to convert the units of a list of energy values.

```python
energies_kcal = [3.24, 6.12, 9.65]          # List of energies in kcal
energies_kj = []                            # Empty list of energies in kJ
conversion_factor = 4.184                   # Conversion factor from kcal to kJ

for energy in energies_kcal:
    energy_kj = energy * conversion_factor
    energies_kj.append(energy_kj)           # Append the converted energy to the kJ list of energies

print(energies_kj)
```

```output
[13.556160000000002, 25.606080000000002, 40.375600000000006]
```
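The long tails of digits in this output come from floating-point arithmetic, not from the conversion itself. If a fixed number of decimals is enough, the built-in `round()` can be applied inside the loop. The sketch below is one possible variation; the name `energies_kj_rounded` is introduced only for this illustration.

```python
energies_kcal = [3.24, 6.12, 9.65]          # List of energies in kcal
conversion_factor = 4.184                   # Conversion factor from kcal to kJ
energies_kj_rounded = []                    # Empty list of rounded energies in kJ

for energy in energies_kcal:
    # round(value, 2) keeps two decimal places
    energies_kj_rounded.append(round(energy * conversion_factor, 2))

print(energies_kj_rounded)
```
```output
[13.56, 25.61, 40.38]
```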

##### Logic statements

Logic statements are used to compare pairs of values and perform an action only if the statement is `True`.
Several types of logic operations can be performed:
- Equal to `==` checks if two values are identical
- Not equal to `!=` checks if two values are different
- Greater than `>`
- Less than `<`
- Greater or equal `>=`
- Less or equal `<=`

Several of these statements can be combined to create more complex conditions by using the following operators:
- `and` checks that both conditions are `True`
- `or` checks that at least one of the conditions is `True`
- `not` inverts a condition, so `A and not B` checks that the first condition is met but the second is not
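Each comparison evaluates to either `True` or `False`, which can be printed directly. A short sketch using the age and height values from earlier in this section:

```python
age = 53
height = 1.73

# Both conditions are True, so the combined result is True
print(age > 50 and height < 1.80)

# Only the second condition is True, which is enough for 'or'
print(age < 18 or height < 1.80)

# 'age > 50' is True, so 'not' turns it into False
print(not age > 50)
```
```output
True
True
False
```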

The basic syntax of these statements is:
```python
if variable == value:
    # Code
else:
    # Code
```

Using the variables we created above to define Pepito Jimenez, we can check several conditions.

```python
name = 'Pepito Jimenez'
age = 53
height = 1.73

if name == 'Pepito Jimenez':
    print('This person is Pepito Jimenez')
else:
    print('This person is not Pepito Jimenez')
```

We can write a more complex statement to check, for example, whether he is older than 50 and shorter than 1.80m:

```python
if age > 50 and height < 1.80:
    print('The person is older than 50 and shorter than 1.80m')
else:
    print('This person is 50 or younger, 1.80m or taller, or both.')
```

Combining these statements with *for loops* we can check a certain condition for all the values in an array:

```python
for energy in energies_kj:
    if energy > 20:
        print(energy)
    else:
        print('<20')
```

```output
<20
25.606080000000002 
40.375600000000006
```

##### Functions

Sometimes a piece of code is needed repeatedly in our scripts.
A good example of this would be converting between units, as we did before.
In these situations, to save time and space, we can define a function that will execute this action whenever we need it.

The structure of a function definition is:
```python
def function_name(args):
    # Code

    return value
```

In the energy conversion example, we can define a function that takes an energy value in kcal as an argument and returns the same energy in kJ.

```python
def kcal_to_kj(energy):
    conversion_factor = 4.184               # Conversion factor from kcal to kj
    energy_kj = energy * conversion_factor

    return energy_kj
```

Using this function we can rewrite the code above as:
```python
energies_kj = []                                     # Start again from an empty list

for energy in energies_kcal:
    energies_kj.append(kcal_to_kj(energy))           # Append the converted energy to the kJ list of energies

print(energies_kj)
```

Functions are very important for keeping our code as clean and organised as possible. They require extensive commenting and documentation to explain their functionality and make them readable for future developers (and future versions of ourselves).
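One common way to document a function is a docstring: a string placed directly under the `def` line. The sketch below shows one possible documented version of `kcal_to_kj`. The optional `conversion_factor` argument is an addition made for this illustration and is not part of the function defined above.

```python
def kcal_to_kj(energy, conversion_factor=4.184):
    """Convert an energy from kcal to kJ.

    Parameters
    ----------
    energy : float
        Energy value in kcal.
    conversion_factor : float, optional
        Number of kJ in one kcal (default 4.184).

    Returns
    -------
    float
        Energy value in kJ.
    """
    return energy * conversion_factor


print(kcal_to_kj(1.0))

# The docstring is stored with the function and can be printed at any time
print(kcal_to_kj.__doc__)
```

The docstring is also what `help(kcal_to_kj)` displays, so anyone using the function can read its documentation without opening the source file.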