## Python for Machine Learning & Mathematical Modeling (MoH Uganda)
By Nuhuh Mutebi, Chief of Data Products & Training (Redea & Redea Institute of Data Science)

## 1. What is Python?
- Python is a high-level, interpreted programming language known for its simplicity, readability, and versatility. 
- It offers a rich ecosystem of libraries and frameworks, making it popular across various domains, including machine learning and mathematical modeling.


In [1]:
# Example: Hello, World! in Python
print("Hello, World!")

Hello, World!


## 2. Why Python for Machine Learning?
- Python's ease of use, flexibility, and extensive library support make it well-suited for machine learning and mathematical modeling tasks. 
- Libraries like NumPy, SciPy, pandas, and scikit-learn provide robust tools for data manipulation, analysis, and modeling.

In [2]:
# Example: Using NumPy for numerical operations
import numpy as np

arr = np.array([1, 2, 3, 4, 5])  # create a one-dimensional NumPy array
mean = np.mean(arr)              # arithmetic mean of the elements
print("Mean:", mean)
print(type(arr))

Mean: 3.0
<class 'numpy.ndarray'>

## 3. Getting Started with Python - Installing Anaconda (Optional)
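
Whichever way Python is installed (Anaconda or otherwise), it can help to confirm that the libraries used later in this notebook are available. The following is a minimal sketch, not an official installation check; the list `required` is an assumption based on the libraries named in this notebook.

```python
import importlib.util
import sys

# Import names of the libraries used later in this notebook (assumed list)
required = ["numpy", "pandas", "sklearn", "matplotlib"]

print("Python version:", sys.version.split()[0])

# find_spec returns None when a top-level package cannot be found
status = {name: importlib.util.find_spec(name) is not None for name in required}
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Any library reported as missing can then be installed with the package manager of your choice before continuing.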

## 4. Python Basics - Syntax
- Comments
- Variables
- Indentation
- Print

In [3]:
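# Example: Variables and basic operations - this is a single-line comment
"""
This is a
multi-line comment
example (technically a string literal that Python evaluates and discards)
"""
x = 11
y = 5

x %= y  # same as x = x % y, so x is now 1 (the remainder of 11 divided by 5)
print(x)
# x = x + 1
z = x + y
print("Sum:", z)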

1
Sum: 6


In [4]:
# Indentation: the indented lines belong to the if/else branch above them
if z > 5:
    print("Z is large")
else:
    print("Z is small")

Z is large
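
Indentation also defines nested blocks, such as a condition inside a loop. The sketch below is illustrative only; the list `numbers` and the threshold of 5 are assumptions chosen for the example.

```python
numbers = [3, 8, 5]  # illustrative values
labels = []

for n in numbers:
    # Everything indented one level belongs to the for loop
    if n > 5:
        # Indented two levels: runs only when the condition is true
        label = "large"
    else:
        label = "small"
    labels.append(label)
    print(f"{n} is {label}")

# Back at the left margin: runs once, after the loop has finished
print("Done:", labels)
```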


## 5. Python Data Structures
Python offers various data structures to store and manipulate data efficiently. 
Here are some essential data structures:

- Numeric types (integers, floats, complex numbers)
- Strings
- Lists
- Tuples
- Dictionaries

In [5]:
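name = "9"                  # a string, not a number
print(type(name))
name1 = name + '4'          # + concatenates strings: "9" + "4" gives "94"
print(name1)
name1_int = int(name1)      # convert the string "94" to the integer 94
print(type(name1_int))
print(name1_int + 3)        # now + performs arithmetic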

<class 'str'>
94
<class 'int'>
97


In [6]:
# Numeric
# Example: Numeric types
# Python supports integers, floats, and complex numbers.
x = 10          # Integer
y = 3.14        # Float
z = 2 + 3j      # Complex number
print("Integer:", x)
print("Float:", y)
print("Complex:", z)

Integer: 10
Float: 3.14
Complex: (2+3j)


In [7]:
# Example: Strings
# Strings are sequences of characters enclosed in single or double quotes.
message = "Hello, World!"
print(message)

Hello, World!


In [8]:
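# Example: Lists and list operations
fruits = ['apple', 'banana', 'cherry']
print(len(fruits))
fruits1 = fruits            # fruits1 is a second name for the same list, not a copy
fruits[-2] = 'another'      # replace the second-to-last item ('banana')
fruits.append('orange')     # add an item to the end
print("Fruits:", fruits1)   # changes made through fruits are visible via fruits1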

3
Fruits: ['apple', 'another', 'cherry', 'orange']
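
Tuples are listed among the data structures above. Unlike lists, they cannot be changed after creation. The sketch below uses illustrative names (`point`, `distances`) that are not part of the original notebook.

```python
# Example: Tuples
point = (3, 4)
px, py = point  # unpack the tuple into two variables
print("x:", px, "y:", py)

# Tuples are immutable: assigning to an element raises TypeError
try:
    point[0] = 10
except TypeError as err:
    error_message = str(err)
    print("Cannot modify a tuple:", error_message)

# Because they are immutable, tuples can be used as dictionary keys
distances = {(0, 0): 0.0, point: 5.0}
print("Distance from origin:", distances[(3, 4)])
```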


In [9]:
# Example: Dictionaries
# Dictionaries store key-value pairs.
student = {
    "name": "Alice",
    "age": 25,
    "grade": "A"
}
print("Student:", student)
print("Name:", student["name"])

Student: {'name': 'Alice', 'age': 25, 'grade': 'A'}
Name: Alice


In [10]:
age = float(input("How old are you?: "))
print(age)
print(type(age))

How old are you?: 21
21.0
<class 'float'>
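
`input()` always returns a string, and `float()` raises `ValueError` when that string is not a number. One way to handle this is sketched below; the helper `parse_age` is a hypothetical name introduced for illustration, and it takes the text as an argument so it can be tried without typing anything.

```python
import math

def parse_age(text):
    """Return the age as a float, or None if `text` is not a valid age."""
    try:
        age = float(text.strip())
    except ValueError:
        return None
    # Reject negative values and special values such as nan or inf
    if not math.isfinite(age) or age < 0:
        return None
    return age

for raw in ["21", " 30.5 ", "twenty", "-3"]:
    print(repr(raw), "->", parse_age(raw))
```

In a notebook, the same helper could be applied to the result of `input(...)`.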


## 6. Python Functions
- Functions are reusable blocks of code that perform specific tasks. They enhance code modularity and organization. 
- Here's how to define and use functions in Python:


In [11]:
# Example: Function definition and usage
def greet(name):
    return "Hello, " + name + "!"

message = greet(input("What is your name? "))
print(message)

What is your name? NUHUH
Hello, NUHUH!
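
Functions can also take default arguments and return several values at once. This is a small sketch; the function `summarize` and its sample data are assumptions made for illustration.

```python
def summarize(values, decimals=2):
    """Return (minimum, maximum, mean) of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return min(values), max(values), round(mean, decimals)

# `decimals` is omitted here, so its default value of 2 is used
low, high, avg = summarize([4, 8, 15, 16, 23, 42])
print("Min:", low, "Max:", high, "Mean:", avg)
```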


### Functions and dictionaries in data manipulation and cleaning


In [None]:
# Define a function to classify each school as rural or urban
# NOTE: `chem` is assumed to be a pandas DataFrame with a 'school' column, loaded earlier
def map_rural_urban(school):
    # Mapping of school names to 'Rural' or 'Urban'
    sub_ru_map = {
        'Redea SSS': 'Rural',
        
        # More mappings to be added
    }
    # Strip leading and trailing whitespace from the school name
    school = school.strip()
    
    # Print the school name for debugging
    # print("School name:", school)
    
    # Return the classification for the given school
    return sub_ru_map.get(school, 'Unknown')  # Default to 'Unknown' if not found

# Add a new column 'rural_urban' and populate it using the map_rural_urban function
chem['rural_urban'] = chem['school'].apply(map_rural_urban)

# Reorder the columns to move 'rural_urban' to index 2 (the third column)
cols = chem.columns.tolist()
cols.insert(2, cols.pop(cols.index('rural_urban')))
chem = chem[cols]

# Save the modified DataFrame back to a CSV file if needed
chem.to_csv('modified_chem.csv', index=False)

# Display the DataFrame with the new column order
chem.head()
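
The cell above depends on a `chem` DataFrame that is not defined in this notebook. The sketch below reproduces the same idea on a small made-up DataFrame so it can be run on its own. The data, the column names other than `school`, and the name `chem_demo` are all assumptions, and it uses `Series.map` with a dictionary as an alternative to `apply` with a function.

```python
import pandas as pd

# Hypothetical sample data standing in for the real `chem` DataFrame
chem_demo = pd.DataFrame({
    "student_id": [1, 2, 3],
    "school": [" Redea SSS", "Redea SSS ", "Other School"],
    "score": [78, 64, 85],
})

# Mapping of school names to 'Rural' or 'Urban' (more entries could be added)
ru_map = {"Redea SSS": "Rural"}

# Strip whitespace, look each school up in the dictionary,
# and fall back to 'Unknown' for schools that are not in the mapping
chem_demo["rural_urban"] = (
    chem_demo["school"].str.strip().map(ru_map).fillna("Unknown")
)

# Move the new column to index 2 (the third column)
cols = chem_demo.columns.tolist()
cols.insert(2, cols.pop(cols.index("rural_urban")))
chem_demo = chem_demo[cols]

print(chem_demo)
```

For a plain lookup like this, `map` with a dictionary tends to be shorter than a custom function; a function remains useful when the cleaning logic grows beyond a simple lookup.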


## 7. Getting Started with Modules or Packages
- Python modules and packages help organize code and facilitate code reuse. 
- Let's explore how to import and use external modules/packages like NumPy and pandas:

#### 1. NumPy
- NumPy is a fundamental package for scientific computing with Python. 
- It provides support for arrays, matrices, and mathematical functions for operations on these data structures.

In [13]:
# Example: Importing and using NumPy
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mean = np.mean(arr)
print("Mean:", mean)

Mean: 3.0
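
Beyond `np.mean`, NumPy applies arithmetic to whole arrays at once and supports matrix operations. The values below are illustrative only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Element-wise arithmetic: no explicit loop is needed
doubled = arr * 2
squared = arr ** 2
print("Doubled:", doubled)
print("Squared:", squared)

# Population standard deviation of the elements
print("Std:", arr.std())

# Matrix-vector product using the @ operator
matrix = np.array([[1, 2], [3, 4]])
product = matrix @ np.array([1, 1])
print("Row sums via matrix product:", product)
```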


#### 2. Pandas
- pandas is a powerful data manipulation and analysis library for Python. 
- It provides data structures like DataFrame and Series, along with tools for reading and writing data from various file formats.

In [14]:
import pandas as pd

# Create a DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Display the DataFrame
print(df)


      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
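
Reading, filtering, and writing data can be sketched as follows. To keep the example self-contained, the CSV content lives in an in-memory string (`csv_text`, an assumption for illustration) rather than a file on disk; with a real file, a path would be passed to `pd.read_csv` and `to_csv` instead.

```python
import io
import pandas as pd

# In-memory CSV text standing in for a real file
csv_text = "Name,Age\nAlice,25\nBob,30\nCharlie,35\n"
df = pd.read_csv(io.StringIO(csv_text))

# Select the rows where Age is at least 30
older = df[df["Age"] >= 30]
print(older)

# Summary statistic for a single column
mean_age = df["Age"].mean()
print("Mean age:", mean_age)

# Write the filtered rows back out as CSV text
out = io.StringIO()
older.to_csv(out, index=False)
print(out.getvalue())
```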


#### 3. scikit-learn
- scikit-learn is a machine learning library for Python that provides simple and efficient tools for data mining and data analysis. 
- It includes various algorithms for classification, regression, clustering, dimensionality reduction, and more.

In [15]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate the model
accuracy = model.score(X_test, y_test)
print("Accuracy:", accuracy)


Accuracy: 1.0
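
Clustering, also mentioned above, works without labels. The sketch below groups the same Iris measurements with k-means; choosing 3 clusters and the `n_init` and `random_state` values are assumptions made for illustration, not tuned settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Use only the measurements; the species labels are ignored
X = load_iris().data

# Group the samples into 3 clusters (an assumed choice for this example)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster of the first 10 samples:", labels[:10])
print("Cluster centers shape:", kmeans.cluster_centers_.shape)
```

The cluster numbers are arbitrary identifiers and need not match the species codes in `iris.target`.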




#### 4. matplotlib
- matplotlib is a plotting library for Python that produces publication-quality figures in a variety of formats and interactive environments.

In [16]:
# PLEASE UNCOMMENT THE LINES BELOW WHEN RUNNING THIS LOCALLY
# import matplotlib.pyplot as plt

# # Create sample data
# x = np.linspace(0, 10, 100)
# y = np.sin(x)

# # Plot the data
# plt.plot(x, y)
# plt.xlabel('X-axis')
# plt.ylabel('Y-axis')
# plt.title('Sine Wave')
# plt.show()

<Figure size 640x480 with 1 Axes>