# Modules

In Python, a module is a file that contains Python code, and it typically includes functions, classes, and variables that can be reused across different programs.

Modules are a way to organize code logically and promote code reusability.

Instead of writing the same code over and over, you can store commonly used functions in a module and import it wherever needed.




## Creating a Module
Any Python file with the `.py` extension is considered a module.

For example, if you create a file called `math_operations.py` containing functions like `add`, `subtract`, `multiply`, etc., this file is now a module that can be imported into other Python scripts.

```python
# math_operations.py

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b
```


## Importing a Module
To use the functions and variables from a module in another file, import the module with the `import` statement.

```python
# main.py

import math_operations

result = math_operations.add(5, 3)
print(result)  # Output: 8
```

You can also import specific functions or variables from a module:

```python

from math_operations import add

result = add(5, 3)
print(result)  # Output: 8
```

The following code demonstrates how to import and use a function from Python's built-in `math` module.

In [None]:
# You can import modules
import math

# Using a function from the math module
sqrt_value = math.sqrt(16)
print("Square Root of 16:", sqrt_value)

Instead of binding the whole `math` module to a name, the following code imports only the `ceil` and `floor` functions from `math`.

Importing specific names does not save memory or time: Python still loads and executes the entire `math` module, then binds only `ceil` and `floor` in your namespace. The benefit is readability, since it is clear which names the code uses.

After this import, you can call `ceil` and `floor` directly, without the `math.` prefix.

In [None]:
# You can get specific functions from a module
from math import ceil, floor
print(ceil(3.7))   # => 4
print(floor(3.7))  # => 3
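
One way to see what a `from ... import ...` statement does behind the scenes is to look at `sys.modules`, the interpreter's cache of loaded modules. The following is a minimal sketch using only the standard library; it suggests that the full `math` module is loaded even when only two names are imported from it.

```python
import sys
from math import ceil, floor

# The whole module object sits in the import cache, even though
# only `ceil` and `floor` were bound in this namespace.
math_module = sys.modules["math"]

print(math_module.sqrt(16))      # 4.0 (other functions are reachable via the cache)
print(ceil is math_module.ceil)  # True (the imported name is the same function object)
```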

This code demonstrates how to import all public functions, classes, and variables from a module using the `*` symbol (often called a "wildcard import").

> **Namespace Pollution**: When you import everything from a module, all of its functions and variables are added to your script's namespace. This can lead to conflicts if the imported names overlap with any names already defined in your code.

For example, if your script defines a function called `log` and then runs `from math import *`, the name `log` is silently rebound to `math.log`, with no warning. The reverse also holds: a `log` defined after the wildcard import hides `math.log`. Either way, the result can be hard-to-find bugs.

In [None]:
# You can import all functions from a module.
# Warning: this is not recommended
from math import *
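
Before using a wildcard import, it can help to check which of your own names it would overwrite. The sketch below is one possible approach, not a standard tool: the set `my_names` is a hypothetical list of names defined in your script, and the rule used (the module's `__all__` if present, otherwise every name without a leading underscore) mirrors how Python selects names for `import *`.

```python
import math

# Hypothetical names that your own script defines.
my_names = {"log", "add", "radius"}

# Names a wildcard import would bring in: the module's __all__ if it
# defines one, otherwise every name without a leading underscore.
wildcard_names = getattr(
    math, "__all__", [name for name in dir(math) if not name.startswith("_")]
)

clashes = sorted(my_names & set(wildcard_names))
print(clashes)  # ['log']
```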

The `import math as m` line imports the math module but assigns it the alias `m`.
This means that everywhere you would typically use math, you can now use `m` instead.
Using an alias can make your code cleaner, especially if a module has a long name or if you’re using it frequently.


In [None]:
# You can shorten module names
import math
import math as m
math.sqrt(16) == m.sqrt(16)  # => True

>  For example, `import numpy as np` and `import pandas as pd` are standard conventions for the numpy and pandas libraries.

## Types of Modules

Python modules can be of several types:

- Built-in Modules: Python ships with a large standard library of modules, such as `math`, `os`, `sys`, and `datetime`, that provide useful functions and classes for common tasks.
- User-Defined Modules: These are modules created by users, like `math_operations.py` in the example above.
- Third-Party Modules: These are modules provided by third-party developers, often available through the Python Package Index (PyPI) and installable with `pip`. Examples include `numpy`, `pandas`, and `requests`.


### Built-in Modules
These are modules included with Python, so you don’t need to install them. They provide common functionality like math operations, file handling, and working with dates.

Example: using the built-in `random` module


In [None]:
import random

# Generate a random integer between 1 and 10 (both ends inclusive)
random_number = random.randint(1, 10)
print("Random number:", random_number)

### User-Defined Modules
These are modules created by the user. Any `.py` file you write can be a module that you import into other Python files.

Suppose you create a file called `greetings.py` with the following code:
```python
# greetings.py

def say_hello(name):
    return f"Hello, {name}!"
```

Then, you can use this module in another file as follows:

```python
# main.py

import greetings

# Use the function from the greetings module
message = greetings.say_hello("Alice")
print(message)  # Output: Hello, Alice!
```

### Third-Party Modules
These modules are created by other developers and are available via the Python Package Index (PyPI).

You can install them with `pip`.

Examples include `requests` for making HTTP requests, `numpy` for numerical operations, and `pandas` for data manipulation.



In [None]:
!pip install requests

In [None]:
import requests

# Fetch data from an API (the timeout keeps the call from hanging indefinitely)
response = requests.get("https://jsonplaceholder.typicode.com/todos/1", timeout=10)

# Check if the request was successful and print the JSON data
if response.status_code == 200:
    print(response.json())


## Priority between modules

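Python modules are ordinary Python files. You can write your own and import them; the module name is the file name without the `.py` extension.

You can find out which functions and attributes are defined in a module using `dir(module_name)`, for example `dir(math)` after `import math`.

If you have a Python script named `math.py` in the same folder as your current script, `import math` may load that file instead of the standard-library module. This happens because the script's folder comes first in the module search path (`sys.path`), ahead of the standard library. The exception is modules compiled into the interpreter itself (listed in `sys.builtin_module_names`), which are always found first; whether `math` is one of them depends on the platform and Python build.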

In [None]:
import math
dir(math)
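
To check whether an import resolves to the standard library or to a local file with the same name, you can ask the import machinery where it would load a module from. The following is a small sketch; `module_origin` is a hypothetical helper written for this example, built on the standard `importlib.util.find_spec` function.

```python
import importlib.util
import sys

def module_origin(name):
    """Return where Python would load `name` from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# A path inside your own project folder would be a sign of shadowing.
for name in ("sys", "random", "math"):
    print(name, "->", module_origin(name))

# Modules compiled into the interpreter cannot be shadowed by local files.
print("math" in sys.builtin_module_names)
```

If the printed path for a standard-library module points into your project folder, renaming the local file is usually the simplest fix.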

## Accessing Module Contents
Modules can contain functions, classes, and variables. Once imported, you can access these using the `module_name.attribute` syntax.

```python
import math

result = math.sqrt(16)  # Using a function from the math module
print(result)  # Output: 4.0
```


## The `__name__` Variable

Every module has a special built-in variable called `__name__`. When a module is run directly, `__name__` is set to `"__main__"`. However, when a module is imported into another script, `__name__` is set to the module's name (the filename without the `.py` extension, for example `"math_operations"`). This allows you to control which parts of the module's code run in each case.

```python
# math_operations.py

def add(a, b):
    return a + b

if __name__ == "__main__":
    print("Running module directly")
    print(add(5, 3))  # Output: 8
```
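
The two cases can be observed side by side. The sketch below writes a throwaway module (the hypothetical `name_demo.py`) to a temporary folder, runs it once as a script and imports it once, and compares the value of `__name__` in each case. It is meant only as an illustration, using standard-library tools.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module that records the value of __name__.
source = "reported_name = __name__\nprint(reported_name)\n"

with tempfile.TemporaryDirectory() as folder:
    module_path = Path(folder) / "name_demo.py"
    module_path.write_text(source)

    # Case 1: run the file directly as a script.
    completed = subprocess.run(
        [sys.executable, str(module_path)],
        capture_output=True, text=True, check=True,
    )
    name_when_run = completed.stdout.strip()

    # Case 2: import the same file as a module.
    sys.path.insert(0, folder)
    try:
        importlib.invalidate_caches()
        name_demo = importlib.import_module("name_demo")
        name_when_imported = name_demo.reported_name
    finally:
        # Undo the temporary changes to the import state.
        sys.path.remove(folder)
        sys.modules.pop("name_demo", None)

print(name_when_run)       # __main__
print(name_when_imported)  # name_demo
```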


## Built-in Functions for Modules
Python offers built-in functions for handling modules, such as:

- `dir(module)`: Lists the names (functions, classes, variables) defined in a module.
- `help(module)`: Displays the module's documentation.

Using these tools, you can explore the functionality and documentation of both built-in and third-party modules.
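
The raw output of `dir()` includes internal names that start with an underscore, so filtering it is often convenient. The sketch below is one way to do that, and it uses `inspect.getdoc` to fetch a docstring as a string instead of opening the interactive `help()` viewer.

```python
import inspect
import math

# Keep only the public names; names starting with "_" are internal details.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(len(public_names), "public names, for example:", public_names[:5])

# inspect.getdoc returns the cleaned-up docstring of an object as a string.
print(inspect.getdoc(math.sqrt))
```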



# Packages

## Organizing Multiple Modules

When you have a collection of modules organized into a folder, it is called a package. Packages make it easier to manage large projects by organizing modules into a hierarchy of directories.

A package typically contains a special `__init__.py` file, which tells Python that the directory should be treated as a package. The file can be empty or contain package-level code.

For example, if you have a directory structure like this:
```text
my_project/
│
├── math_operations/
│   ├── __init__.py
│   ├── addition.py
│   └── subtraction.py
└── main.py
```
Then, in `main.py`, you can import these modules as follows:

```python
from math_operations import addition, subtraction
```
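
To try this without creating files by hand, the sketch below builds the same layout in a temporary folder and imports from it. The contents of `addition.py` and `subtraction.py` (the `add` and `subtract` functions) are assumptions made for this illustration, and adding the folder to `sys.path` stands in for running `main.py` from inside `my_project/`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project:
    package = Path(project) / "math_operations"
    package.mkdir()

    # An empty __init__.py is enough to mark the folder as a package.
    (package / "__init__.py").write_text("")
    # The function bodies below are assumed contents, for illustration only.
    (package / "addition.py").write_text("def add(a, b):\n    return a + b\n")
    (package / "subtraction.py").write_text("def subtract(a, b):\n    return a - b\n")

    sys.path.insert(0, project)  # stands in for running main.py from my_project/
    try:
        importlib.invalidate_caches()
        from math_operations import addition, subtraction

        total = addition.add(5, 3)
        difference = subtraction.subtract(5, 3)
    finally:
        # Undo the temporary changes to the import state.
        sys.path.remove(project)
        for name in [m for m in sys.modules if m.split(".")[0] == "math_operations"]:
            del sys.modules[name]

print(total, difference)  # 8 2
```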

## Safe packages

**How do you know whether a Python package is safe to use?**

- Finding it on the Python Package Index (PyPI) is a first check, though not a guarantee, that the package is legitimate: https://pypi.org/
- Check its dependencies at https://libraries.io/
- Does the package support the Python version that you’re working with?
- How popular is the package?
- Is the package’s codebase well maintained?
- Do other packages rely on the package?
- Does the package’s license fit your needs?
- What’s the exact pip install command for the package?

Even when you follow the good practice of working with a virtual environment, Python packages can access other parts of your operating system outside your project’s folder.

Attackers may upload packages whose names differ from a popular package by two swapped letters or by one letter replaced with its keyboard neighbor. This imitation technique is known as **typosquatting**. Such packages can contain malware and shouldn't find their way onto your system.
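
A simple safeguard is to compare a package name against names you already trust before installing it. The sketch below is a rough heuristic only, not a security tool: `KNOWN_PACKAGES` is a hypothetical, deliberately tiny allow-list, `possible_typosquat` is a helper invented for this example, and the similarity cutoff of 0.8 is an arbitrary assumption.

```python
import difflib

# Hypothetical allow-list; a real check would need a much larger list of known names.
KNOWN_PACKAGES = ["requests", "numpy", "pandas"]

def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None."""
    if name in known:
        return None  # exact match: this is the real name
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(possible_typosquat("reqeusts"))  # requests
print(possible_typosquat("numpy"))     # None
```

A non-`None` result only means the name deserves a second look; it does not prove the package is malicious.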


## How to know the version of a package?

Many packages expose their version number as a `__version__` attribute. The attribute is not mandatory, but defining it is a widely followed convention.

Checking it tells you which version you are working with, which matters because different versions of a package may change behavior or add features.




In [None]:
import pandas as pd

print(pd.__version__)


In [None]:
import numpy as np

print(np.__version__)
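
Because `__version__` is optional, an alternative is to read the version from the installed package metadata with the standard-library `importlib.metadata` module (available from Python 3.8). The sketch below wraps that call in a hypothetical helper, `installed_version`, that returns `None` when a package is not installed; the last name in the loop is a made-up package used to show that case.

```python
from importlib import metadata

def installed_version(distribution_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None

for name in ("pandas", "numpy", "surely-not-installed-package"):
    print(name, "->", installed_version(name))
```

This approach does not import the package itself, so it also works for packages that are slow or costly to import.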

## How to generate a list of all installed Python packages?

You can use either `pip list` or `pip freeze`.
They differ in output format and intended use:
- `pip list` prints a human-readable table.
- `pip freeze` prints `name==version` lines suitable for a requirements file.

Open your terminal or command prompt and type
```bash
pip freeze
```
If you are on Colab or in a Jupyter Notebook, you can run it in a code cell by starting the line with `!`, which marks it as a shell command.
```python
!pip freeze
```
In both cases you'll get a list of the installed packages and their versions.

This list is important because:
- It helps you see what packages are installed, which is crucial for managing dependencies in a project
- It ensures that you can keep track of package versions, which helps maintain compatibility and avoid issues when sharing code or deploying applications

> If you save the list to a text file called `requirements.txt` (for example with `pip freeze > requirements.txt`), you can recreate the environment on another system with `pip install -r requirements.txt` or share it with others.
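
The same information is available from inside Python through the standard-library `importlib.metadata` module. The sketch below is a rough approximation of `pip freeze`, not a replacement for it: `installed_requirements` is a hypothetical helper, and it ignores details such as editable installs that `pip freeze` handles.

```python
from importlib import metadata

def installed_requirements():
    """Return sorted 'name==version' lines for the current environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)

# Show the first few entries only, to keep the output short.
for line in installed_requirements()[:5]:
    print(line)
```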

For production pipelines, a dependency management and packaging tool such as Poetry is recommended because it simplifies managing a project's dependencies and packaging.

More information about [Poetry](https://python-poetry.org/docs/)

## How to install a Python package from PyPI?

To install a Python package from PyPI (Python Package Index), you can use the pip command in your terminal or command prompt.

1. Install the Latest Version

  ```bash
  pip install package_name
  ```
  The `-U` flag in pip install stands for "upgrade." When you use this flag, it instructs pip to upgrade the specified package to the latest version available on PyPI, even if a version is already installed.
  ```bash
  pip install -U package_name
  ```


2. Install a Specific Version

  To install a specific version of a package, specify the version number:

  ```bash
  pip install package_name==version_number
  ```
3. Install At Least a Given Version

  To install a package that is at least a certain version, use the greater-than-or-equal-to operator (`>=`). Quote the requirement so that the shell does not interpret `>` as output redirection:

  ```bash
  pip install "package_name>=version_number"
  ```
4. Install a Stable Version

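  To avoid unexpected behavior caused by newer versions of a package, set an upper bound with the less-than-or-equal-to operator (`<=`) so that you stay on a version you have tested. Quote the requirement here as well, because the shell treats `<` as input redirection:
  ```bash
  pip install "package_name<=version_number"
  ```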
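
When installing from a Python script rather than a terminal, one option is to build the command as a list and hand it to `subprocess.run`. The sketch below only constructs and prints the command; `pip_install_command` is a hypothetical helper, and `package_name>=1.0` is a placeholder requirement. Using `sys.executable -m pip` targets the interpreter that is currently running.

```python
import subprocess
import sys

def pip_install_command(requirement, upgrade=False):
    """Build (but do not run) a pip install command for the current interpreter."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("-U")
    command.append(requirement)
    return command

command = pip_install_command("package_name>=1.0", upgrade=True)
print(command)

# Uncomment to actually install. The list form bypasses the shell,
# so ">=" is not treated as output redirection and needs no quoting.
# subprocess.run(command, check=True)
```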

# 🏆 Packages Challenge

**Exploratory Data Analysis on NYC For-Hire Vehicle Dataset**

In recent years, the New York City transportation landscape has undergone significant changes, particularly with the rise of for-hire vehicle (FHV) services. Understanding the patterns and trends in FHV usage is crucial for city planners, policymakers, and transportation companies aiming to improve urban mobility and enhance service delivery.

As part of a data analysis initiative, we are tasked with performing an exploratory data analysis (EDA) on the For-Hire Vehicles dataset provided by the City of New York. This analysis aims to uncover insights about FHV usage, identify trends, and help stakeholders make informed decisions regarding transportation policies.

**Objective**

The primary objective of this challenge is to download the For-Hire Vehicles dataset from [NYC's Open Data portal](https://data.cityofnewyork.us/Transportation/For-Hire-Vehicles-FHV-Active/8wbx-tsch/about_data), perform preliminary data cleaning, and generate a comprehensive EDA report using the [`ydata-profiling` library](https://pypi.org/project/ydata-profiling/).

This report will serve as a foundational analysis to guide further research and decision-making.

