
# Modules and Packages

Modules and packages help organize Python code, keeping it clean, modular, and easy to maintain.

## Python Modules

In Python, a module is a file containing Python code, which can include functions, classes, or variables. Modules help us to organize and reuse our code effectively.

>Note: To follow the examples outside a notebook, create `mod.py` and a second file for the importing code in your own code editor, then copy and paste the code into those two files.

## Create a Module

To create a module, write Python code into a file with a `.py` extension. In this case, we name the file `mod.py`.

```python
#mod.py
s = "I'm learning Python in order to become an AI engineer."
a = [100, 200, 300]

def foo(arg):
    print(f'arg = {arg}')
```

In [1]:
%%writefile mod.py

s = "I'm learning Python in order to become an AI engineer."
a = [100, 200, 300]

def subtraction(x, y):
  return x - y

Writing mod.py


## Using Modules

To use a module, import it with the `import` keyword. After importing, you can access its functions, classes, and variables by writing the module name, a dot, and then the name you want to use.

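```python
import mod

print(mod.s)  # module-level string
print(mod.a)  # module-level list; a bare `mod.a` prints nothing in a script

# foo is the function defined in the mod.py listing above
mod.foo(['quux', 'corge', 'grault'])
```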

In [2]:
import mod

In [3]:
mod.a

[100, 200, 300]

In [4]:
mod.s

"I'm learning Python in order to become an AI engineer."

In [5]:
mod.subtraction(10, 5)

5

In [6]:
from mod import subtraction

In [7]:
subtraction(10, 8)

2
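The cells above use two import forms: `import mod` and `from mod import subtraction`. The following is a minimal sketch that puts them side by side and adds the alias form. It assumes a throwaway module called `demo_mod` (a hypothetical name, chosen so the real `mod.py` is left untouched) written to a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a small throwaway module. `demo_mod` is a hypothetical name
# used only for this illustration.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_mod.py").write_text(
    "a = [100, 200, 300]\n"
    "\n"
    "def subtraction(x, y):\n"
    "    return x - y\n"
)

# Python only finds modules in directories listed on sys.path.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# Form 1: import the module, then qualify each name with the module name.
import demo_mod
print(demo_mod.subtraction(10, 5))  # 5

# Form 2: import the module under a shorter alias.
import demo_mod as dm
print(dm.a)  # [100, 200, 300]

# Form 3: import one name directly into the current namespace.
from demo_mod import subtraction
print(subtraction(10, 8))  # 2
```

The qualified forms make it obvious where a name comes from, while `from ... import ...` saves typing at the cost of possible name clashes with your own variables.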

## Package Management with pip


`pip` is Python's package manager. It lets you install, update, and remove packages, which are collections of modules that provide additional functionality. By default, `pip` downloads packages from the Python Package Index (PyPI).

### Basic commands

Here is a list of basic commands for package management with pip:

- To install a package: `pip install package_name`
- To update a package: `pip install --upgrade package_name`
- To remove a package: `pip uninstall package_name`
- To list installed packages: `pip list`
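After installing or upgrading a package, it can help to confirm from Python which version is actually present. The sketch below uses the standard library's `importlib.metadata` for that. The helper name `installed_version` and the fake package name are assumptions made for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# `pip` itself is usually installed; the second name is deliberately fake.
for name in ["pip", "surely-not-a-real-package-xyz"]:
    print(name, "->", installed_version(name))
```

This reports the same version information that `pip list` shows, but for a single package and without leaving Python.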

In [8]:
!pip install openai

Collecting openai
  Downloading openai-1.35.15-py3-none-any.whl (328 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 328.6/328.6 kB 4.1 MB/s eta 0:00:00
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.6/75.6 kB 5.1 MB/s eta 0:00:00
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.9/77.9 kB 4.0 MB/s eta 0:00:00
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 4.9 MB/s eta 0:00:00
Installing collected packages: h11, httpcore, httpx, openai
Successfully installed h11-0.14.0 httpcore-1.0.5 h

In [9]:
from openai import OpenAI

### Requirements file

A `requirements.txt` file helps you to manage package dependencies for your project. It lists the package names with their version numbers that your project depends on.

You can create this file manually or generate it using `pip freeze > requirements.txt`.

The file should list one package per line:

```txt
numpy==1.21.2
pandas==1.3.3
matplotlib==3.4.3
```

Use the `pip install -r requirements.txt` command to install the packages listed in `requirements.txt`.
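To make the file format concrete, here is a deliberately simplified sketch that reads `name==version` lines like the sample above. The function `read_requirements` is a hypothetical helper, not part of `pip`. Real requirements files allow much more, including other version operators, environment markers, and URLs.

```python
def read_requirements(text):
    """Return (name, version) pairs from simple `name==version` lines.

    Simplified on purpose: blank lines and `#` comments are skipped, and
    a line without `==` gets version None.
    """
    requirements = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        name, sep, version = line.partition("==")
        requirements.append((name.strip(), version.strip() if sep else None))
    return requirements


sample = """\
numpy==1.21.2
pandas==1.3.3
matplotlib==3.4.3
"""

for name, version in read_requirements(sample):
    print(f"{name} pinned to {version}")
```

Pinning exact versions with `==` makes an environment reproducible, while leaving the version off lets `pip` pick the newest compatible release.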

In [10]:
%%writefile requirements.txt
beautifulsoup4
fake_useragent
imageio
keras ; python_version < '3.12'
lxml
matplotlib
numpy
opencv-python
pandas
pillow
# projectq  # uncomment once quantum/quantum_random.py is fixed
qiskit ; python_version < '3.12'
qiskit-aer ; python_version < '3.12'
requests
rich
# scikit-fuzzy  # uncomment once fuzzy_logic/fuzzy_operations.py is fixed
scikit-learn
statsmodels
sympy
tensorflow
tweepy
# yulewalker  # uncomment once audio_filters/equal_loudness_filter.py is fixed
typing_extensions
xgboost

Writing requirements.txt


In [11]:
!pip install -U openai



In [12]:
pip install -r requirements.txt

Collecting fake_useragent (from -r requirements.txt (line 2))
  Downloading fake_useragent-1.5.1-py3-none-any.whl (17 kB)
Collecting qiskit (from -r requirements.txt (line 12))
  Downloading qiskit-1.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.3/4.3 MB 15.7 MB/s eta 0:00:00
Collecting qiskit-aer (from -r requirements.txt (line 13))
  Downloading qiskit_aer-0.14.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.4/12.4 MB 35.5 MB/s eta 0:00:00
Collecting rustworkx>=0.14.0 (from qiskit->-r requirements.txt (line 12))
  Downloading rustworkx-0.15.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 22.4 MB/s eta 0:00:00
Collecting dill>=0.3 (from qiskit-

## Using Modules from Packages

First, we need to install the packages using `pip`:

```shell
pip install pandas numpy
```

Then, you can import a module in your Python script using the `import` statement.

In [13]:
import pandas as pd
import numpy as np

# Create a simple dataset
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 55000, 60000, 65000]
}

# Convert the dataset to a pandas DataFrame
df = pd.DataFrame(data)

# Calculate the mean of the Age column using numpy
mean_age = np.mean(df['Age'])
print(f'Mean Age: {mean_age}')

# Calculate the sum of the Salary column using pandas
total_salary = df['Salary'].sum()
print(f'Total Salary: {total_salary}')

# Add a new column named 'Salary_with_bonus' to the DataFrame
df['Salary_with_bonus'] = df['Salary'] * 1.1
print(df)

Mean Age: 32.5
Total Salary: 230000
      Name  Age  Salary  Salary_with_bonus
0    Alice   25   50000            55000.0
1      Bob   30   55000            60500.0
2  Charlie   35   60000            66000.0
3    David   40   65000            71500.0


## Exercise: Modules and Packages

In [14]:
!pip install rggrader

Collecting rggrader
  Downloading rggrader-0.1.6-py3-none-any.whl (2.5 kB)
Installing collected packages: rggrader
Successfully installed rggrader-0.1.6


In [15]:
# @title #### Student Identity
student_id = "REAFCDNE" # @param {type:"string"}
name = "Riofebri Prasetia" # @param {type:"string"}

In [16]:
from rggrader import submit

In [22]:
# @title #### 00. Install and Use a Package
from rggrader import submit
import numpy as np

# TODO:
# 1. Import the numpy package with alias "np". Note: you do not need to install numpy using pip in this context, assume it is already installed.
# 2. Use numpy's function that can calculate the average of the list [1, 2, 3, 4, 5] and assign the output to a variable named 'average'.


# Put your code here:
average = 0
list_average = [1, 2, 3, 4, 5]
average = np.mean(list_average)
print(average)

# ---- End of your code ----

# Do not modify the code below. It is used to submit your solution.
assignment_id = "05-modules-and-packages"
question_id = "00_install_use_package"
submit(student_id, name, assignment_id, str(average), question_id)

# Example:
# Package: numpy with alias "np"
# Function: Calculate the sum of the list [10, 20, 30, 40, 50]
# Output: 150

3.0


'Assignment successfully submitted'

In [23]:
# @title #### 01. Standard Deviation of list elements
from rggrader import submit

# TODO:
# 1. Import numpy package with alias "np".
# 2. Use numpy's function that can calculate the standard deviation of elements in the list [1, 2, 3, 4, 5] and assign the output to a variable named 'std_dev'.


# Put your code here:
import numpy as np  # imported here so this cell does not depend on earlier cells

std_dev = np.std([1, 2, 3, 4, 5])
print(std_dev)
# ---- End of your code ----

# Do not modify the code below. It is used to submit your solution.
assignment_id = "05-modules-and-packages"
question_id = "02_std_dev_of_list"
submit(student_id, name, assignment_id, str(std_dev), question_id)

# Example:
# Package: numpy with alias "np"
# Function: Calculate the standard deviation of the list [10, 20, 30, 40, 50]
# Output: 14.142136

1.4142135623730951


'Assignment successfully submitted'

In [26]:
# @title #### 02. Create a DataFrame using pandas
from rggrader import submit
import pandas as pd
# TODO:
# 1. Import pandas package with alias "pd".
# 2. Use pandas DataFrame constructor to create a DataFrame from the dictionary: {"Name": ["Anna", "Bob", "Charlie"], "Age": [21, 25, 30]}.
#    Assign the output to a variable named 'df'.


# Put your code here:
df = None
data_dictionary = {"Name": ["Anna", "Bob", "Charlie"], "Age": [21, 25, 30]}
df = pd.DataFrame(data_dictionary)
print(df)
# ---- End of your code ----

# Do not modify the code below. It is used to submit your solution.
assignment_id = "05-modules-and-packages"
question_id = "03_create_dataframe"
submit(student_id, name, assignment_id, df.to_string(), question_id)

# Example:
# Package: pandas with alias "pd"
# Function: Create a DataFrame from the dictionary: {"Country": ["Finland", "Sweden", "Norway"], "Population": [5.5, 10.4, 5.4]}
# Output:
#    Country  |  Population
#    Finland  |  5.5
#    Sweden   |  10.4
#    Norway   |  5.4

      Name  Age
0     Anna   21
1      Bob   25
2  Charlie   30


'Assignment successfully submitted'