# Various Topics

# Virtual Environments

## Plain Old Python

Virtual environments (VEs) allow you to create functionally distinct installations of Python.

This allows you to run code that has specific dependencies.

It also allows you to share code environments with others, so they can reproduce your setup.

There are at least two ways to create VEs &mdash; with Python and with Conda.

## With Python

```bash
# Create by calling the `venv` module
python -m venv my_env

# Use the shell command `source` to run the `activate` script
source my_env/bin/activate

# Install a package into the active environment by passing its name to pip
python -m pip install numpy

# Get out of the environment
deactivate
```

When an environment is activated, you will see the environment name in parentheses prefixed to your command line prompt:

```bash
# The default environment
(base) rca2t:

# Inside my_env
(my_env) rca2t:
```
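From inside Python, one way to check whether a `venv`-style environment is active is to compare `sys.prefix` with `sys.base_prefix`, which differ inside such an environment. The sketch below does this. The helper name `in_virtualenv` is made up for illustration, and the check does not detect Conda environments.

```python
import sys

def in_virtualenv() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix points at the Python it was created from.
    return sys.prefix != sys.base_prefix

print("Interpreter:", sys.executable)
print("In a virtual environment:", in_virtualenv())
```

Running this before and after `source my_env/bin/activate` should show the interpreter path change.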

## With Conda

Conda provides a less complicated way to create and manage environments.

To create a new environment:

```bash
# Create an environment with defaults
conda create -n my_env 

# Create an environment with a specific version of Python
conda create -n my_env python=3.8

# Create an environment with a specific package
conda create -n my_env scipy

# Create an environment with a specific package and version
conda create -n my_env scipy=0.17.3

# Create an environment with multiple parameters
conda create -n my_env python=3.9 scipy=0.17.3 astroid babel

# Activate the environment
conda activate my_env

# Deactivate the environment
conda deactivate
```

Conda offers several commands to manage environments:

```bash
usage: conda-env [-h] command ...

positional arguments:
  command
    create    Create an environment based on an environment definition file. If using an environment.yml file (the
              default), you can name the environment in the first line of the file with 'name: envname' or you can
              specify the environment name in the CLI command using the -n/--name argument. The name specified in the
              CLI will override the name specified in the environment.yml file. Unless you are in the directory
              containing the environment definition file, use -f to specify the file path of the environment
              definition file you want to use.
    export    Export a given environment
    list      List the Conda environments
    remove    Remove an environment
    update    Update the current environment based on environment file
    config    Configure a conda environment
```

To see what environments you have installed, do this:

```bash
rca2t@rivanna$ conda env list
# conda environments:
#
base                  *  /apps/software/standard/core/anaconda/2020.11-py3.8
eta                      /home/rca2t/.conda/envs/eta
```

## Create from a YAML file

```bash
conda env create -f environment.yml
```

A simple YAML file:

```yaml
name: stats
dependencies:
  - numpy
  - pandas
```

The `name:` entry on the first line of the YAML file sets the new environment's name.

A more complex version:

```yaml
name: stats2
channels:
  - javascript
dependencies:
  - python=3.9
  - bokeh=2.4.2
  - conda-forge::numpy=1.21.*
  - nodejs=16.13.*
  - flask
  - pip
  - pip:
    - Flask-Testing
```


See the Conda docs on [Managing Environments](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).

# Dunder Methods

Python classes have a number of special methods, known as dunder ("double underscore") methods, that Python calls implicitly.

For example, `__init__()` is a dunder method.

So is `__add__()`, which means you can define how the `+` operator works with the objects of your class!

Here are some others:

- `__new__()`: Creates a new instance of a class.

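- `__repr__()`: Returns an unambiguous string representation of an object, aimed at developers.

- `__str__()`: Returns a reader-friendly string representation of an object.

- `__del__()`: Called when an object is about to be destroyed.

- `__lt__()`, `__le__()`, `__eq__()`, `__ne__()`, `__gt__()`, `__ge__()`: Implement the comparison operators `<`, `<=`, `==`, `!=`, `>`, and `>=`.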
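As a minimal sketch of how a few of these methods fit together, here is a small class that defines its own string forms, equality, ordering, and addition. The `Vector` class is invented for illustration and is not part of any library.

```python
class Vector:
    """A tiny 2-D vector used only to illustrate dunder methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, shown in the REPL and by repr()
        return f"Vector({self.x!r}, {self.y!r})"

    def __str__(self):
        # Friendlier form, used by print() and str()
        return f"<{self.x}, {self.y}>"

    def __eq__(self, other):
        # Called for v1 == v2
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __lt__(self, other):
        # Called for v1 < v2; here we compare by squared length
        if not isinstance(other, Vector):
            return NotImplemented
        return self.x**2 + self.y**2 < other.x**2 + other.y**2

    def __add__(self, other):
        # Called for v1 + v2
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

a = Vector(1, 2)
b = Vector(3, 4)
print(repr(a + b))               # Vector(4, 6)
print(a + b)                     # <4, 6>
print(a == Vector(1, 2), a < b)  # True True
```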


For more info, see [the Python docs](https://docs.python.org/3/reference/datamodel.html) and this [nice intro](https://mathspp.com/blog/pydonts/dunder-methods).


# Decorators

Decorators are constructs that do things with functions.

Essentially, they are functions of functions (or classes).

Syntactically, they appear as lines prefixed with `@` right above a function or class definition.

For example, the web application development tool Flask uses decorators to map functions to URLs.

```python
@app.route('/test')
def test():
    return render_template('test.html')
```
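To see what "a function of a function" means in practice, here is a rough sketch of a route-style decorator written in plain Python. It only illustrates the idea of mapping paths to functions and is not how Flask is implemented; the names `routes`, `route`, and `hello` are hypothetical.

```python
routes = {}  # maps a URL path to the function that handles it

def route(path):
    """Decorator factory: returns a decorator that registers a function under `path`."""
    def decorator(func):
        routes[path] = func   # record the mapping
        return func           # hand the function back unchanged
    return decorator

@route('/hello')
def hello():
    return 'hello page'

# The @ line above is shorthand for: hello = route('/hello')(hello)
print(routes['/hello']())   # hello page
```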

Another example is the `dataclasses` module, introduced in Python 3.7.

A data class is a class that typically contains mainly data, although it can define methods like any other class.

It is created using the new `@dataclass` decorator, as follows:

```python
from dataclasses import dataclass

@dataclass
class DataClassCard:
    rank: str
    suit: str
```

Doing this will automatically create a class with basic functionality already implemented.

You can instantiate, print, and compare data class instances straight out of the box:

```python
queen_of_hearts = DataClassCard('Q', 'Hearts')
queen_of_hearts.rank
```

Compare that to a regular class:

```python
class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit
```
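The difference shows up as soon as you print or compare instances. The following sketch repeats both class definitions so that it runs on its own.

```python
from dataclasses import dataclass

@dataclass
class DataClassCard:
    rank: str
    suit: str

class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

dc = DataClassCard('Q', 'Hearts')
rc = RegularCard('Q', 'Hearts')

print(dc)                                   # DataClassCard(rank='Q', suit='Hearts')
print(dc == DataClassCard('Q', 'Hearts'))   # True: the generated __eq__ compares fields
print(rc == RegularCard('Q', 'Hearts'))     # False: the default __eq__ compares identity
# print(rc) would show something like <__main__.RegularCard object at 0x...>
```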

## Another Example

[Source](https://www.infoworld.com/article/3563878/how-to-use-python-dataclasses.html)

```python
class Book:
    '''Object for tracking physical books in a collection.'''
    def __init__(self, name: str, weight: float, shelf_id:int = 0):
        self.name = name
        self.weight = weight # in grams, for calculating shipping
        self.shelf_id = shelf_id
    def __repr__(self):
        # Adjacent f-strings are concatenated, so the string can span two lines
        return (f"Book(name={self.name!r}, "
                f"weight={self.weight!r}, shelf_id={self.shelf_id!r})")
```

Notice the way each of the arguments passed to `__init__` has to be copied to the object's attributes by hand.

Here is the same class implemented as a dataclass:

```python
from dataclasses import dataclass

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str
    weight: float 
    shelf_id: int = 0
```

Note the type annotations and the default value; `@dataclass` uses them to generate `__init__`.

`@dataclass` automatically creates code for a number of common dunder methods in the class. 

For example, in the conventional class above, we had to create our own `__repr__`. 

The `@dataclass` decorator generates the `__repr__` for you.

For more on `__repr__`, see [this helpful tutorial](https://www.pythontutorial.net/python-oop/python-__repr__/).


**Advanced Python dataclass initialization**

The dataclass decorator can take initialization options of its own. 

Most of the time you won't need to supply them, but they can come in handy for certain edge cases. 

Here are some of the most useful ones (they're all True/False):


- `frozen`: Generates class instances that are read-only. Once data has been assigned, it can't be modified.
- `slots`: Allows instances of dataclasses to use less memory by only allowing fields explicitly defined in the class (Python 3.10+).
- `kw_only`: When set, all fields for the class are keyword-only (Python 3.10+).

```python
# Every option, shown with its default value (signature as of Python 3.11)
@dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False,
           match_args=True, kw_only=False, slots=False, weakref_slot=False)
class Book: ...
```
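As a small sketch of what these options do, the example below combines `frozen=True` with `order=True`. The `Version` class is hypothetical and exists only for this illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int

v1 = Version(1, 4)
v2 = Version(2, 0)

# order=True generates <, <=, >, >= that compare fields as a tuple, in definition order
print(v1 < v2)            # True
print(sorted([v2, v1]))   # [Version(major=1, minor=4), Version(major=2, minor=0)]

try:
    v1.major = 3          # frozen=True makes assignment raise
except FrozenInstanceError as err:
    print("read-only:", err)
```

`FrozenInstanceError` is a subclass of `AttributeError`, so existing handlers for the latter also catch it.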

**Customize Python dataclass fields with the field function**

Use the `field()` function for fine-tuning:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str     
    condition: str = field(compare=False)    
    weight: float = field(default=0.0, repr=False)
    shelf_id: int = 0
    chapters: List[str] = field(default_factory=list)
```

When you assign a call to `field()` as a field's default value, the parameters you pass to `field()` control how that field is set up.

These are the most commonly used options for field (there are others):

- `default`: Sets the default value for the field. 
- `default_factory`: A function that takes no parameters and returns the object to use as the default value for the field. Use it for mutable defaults such as lists.
- `repr`: Controls whether the field shows up in the automatically generated `__repr__` for the dataclass (default `True`). Here we don't want the book's weight shown in the `__repr__`, so we use `repr=False` to omit it.
- `compare`: Controls whether the field is included in the comparison methods automatically generated for the dataclass (default `True`). Here, we don't want `condition` to be used when comparing two books, so we set `compare=False`.

Note that the order of the fields matters &mdash; the non-default fields come first.
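To make the effect of these options concrete, the sketch below repeats the `Book` class from above and creates two instances. The book title and values are made up.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str
    condition: str = field(compare=False)            # ignored by ==
    weight: float = field(default=0.0, repr=False)   # hidden from repr
    shelf_id: int = 0
    chapters: List[str] = field(default_factory=list)

a = Book("Dune", "Good", weight=410.0)
b = Book("Dune", "Worn", weight=410.0)

print(a)        # Book(name='Dune', condition='Good', shelf_id=0, chapters=[])
print(a == b)   # True: condition is excluded from the comparison
```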

**Controlling Python dataclass initialization**

To get control over the init process, use `__post_init__`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str    
    weight: float = field(default=0.0, repr=False)
    shelf_id: Optional[int] = field(init=False)
    chapters: List[str] = field(default_factory=list)
    condition: str = field(default="Good", compare=False)

    def __post_init__(self):
        if self.condition == "Discarded":
            self.shelf_id = None
        else:
            self.shelf_id = 0
```

The `__post_init__` method sets `shelf_id` to `None` if the book’s condition is initialized as "Discarded". 

Note how we use `field()` to declare `shelf_id`, and pass `init=False` to it.

This means `shelf_id` won't be a parameter of `__init__`; it is set in `__post_init__` instead.
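The same pattern works for validation and for values derived from other fields. Here is a minimal sketch using a hypothetical `Rectangle` class, which is not from the source article.

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)   # not an __init__ parameter

    def __post_init__(self):
        # Runs right after the generated __init__: validate, then derive
        if self.width < 0 or self.height < 0:
            raise ValueError("width and height must be non-negative")
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r)   # Rectangle(width=3.0, height=4.0, area=12.0)

try:
    Rectangle(-1.0, 2.0)
except ValueError as err:
    print("rejected:", err)
```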

**`InitVar`**

The `InitVar` type lets you specify a field that will be passed to `__init__` and then to `__post_init__`, but won’t be stored in the class instance.

This allows you to take in parameters when setting up the dataclass that are only used during initialization. 

```python
from dataclasses import dataclass, field, InitVar
from typing import List, Optional

@dataclass
class Book:
    '''Object for tracking physical books in a collection.'''
    name: str     
    condition: InitVar[str] = "Good"
    weight: float = field(default=0.0, repr=False)
    shelf_id: Optional[int] = field(init=False)  # may be None, so Optional
    chapters: List[str] = field(default_factory=list)

    def __post_init__(self, condition):
        if condition == "Unacceptable":
            self.shelf_id = None
        else:
            self.shelf_id = 0
```

Setting a field’s type to `InitVar` signals to `@dataclass` to not make that field into a dataclass field, but to pass the data along to `__post_init__` as an argument.
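One way to confirm that an `InitVar` value is not stored is to inspect the instance afterwards. The sketch below uses a hypothetical `Parcel` class that accepts a weight in pounds but keeps only grams; the class and constant names are assumptions made for this example.

```python
from dataclasses import dataclass, field, fields, InitVar

GRAMS_PER_POUND = 453.59237

@dataclass
class Parcel:
    name: str
    weight_lb: InitVar[float]             # accepted by __init__, never stored
    weight_g: float = field(init=False)   # stored, computed in __post_init__

    def __post_init__(self, weight_lb):
        self.weight_g = weight_lb * GRAMS_PER_POUND

p = Parcel("Dune", 2.0)
print([f.name for f in fields(p)])   # ['name', 'weight_g']
print("weight_lb" in vars(p))        # False
```

Note that `fields()` skips `InitVar` pseudo-fields, which is why only `name` and `weight_g` are listed.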

For more info, check out [the Python docs on the subject](https://docs.python.org/3.11/library/dataclasses.html).

# Comparing R and Python

See [this resource](https://www.mit.edu/~amidi/teaching/data-science-tools/conversion-guide/r-python-data-manipulation/) from MIT.

# Getting in Reps

[W3Schools](https://www.w3schools.com/python) is a great site for developing your Python coding skills.

Try these [exercises](https://www.w3schools.com/python/python_exercises.asp) to reinforce what you've learned in this course.

# Experiment

Changing a class-level default after instances exist does not affect them, because each instance received its own `x` in `__init__`:

```python
from dataclasses import dataclass

@dataclass
class Test:
    x: int = 0
    y = []    # no annotation, so this is a plain class attribute, not a field

t1 = Test()
t2 = Test()

Test.x, t1.x, t2.x
# (0, 0, 0)

Test.x = 1
Test.x, t1.x, t2.x
# (1, 0, 0)
```

But the mutables remain a problem:

```python
t1.y.append(1)
Test.y, t1.y, t2.y
# ([1], [1], [1])
```
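The list is shared because `y` has no type annotation, so `@dataclass` leaves it as an ordinary class attribute. A minimal sketch of one fix is to annotate the field and give it a `default_factory`, so each instance gets its own list. The class names `Fixed` and `Broken` are made up for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Fixed:
    x: int = 0
    y: list = field(default_factory=list)   # annotated, so it is a real field

t1 = Fixed()
t2 = Fixed()
t1.y.append(1)
print(t1.y, t2.y)   # [1] []

# A bare mutable default on an annotated field is rejected when the class is created
try:
    @dataclass
    class Broken:
        y: list = []
except ValueError as err:
    print("rejected:", err)
```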