## CS50 AI - December 22, 2024

**Instructor:** Suwash Shrestha

**Welcome to CS50 AI!**

This course will introduce you to the fundamentals of artificial intelligence.


# Introduction to Python for Data Science

Python has become one of the most widely used languages for data science and machine learning, for several reasons.
 
First and foremost, Python boasts a simple and readable syntax, which makes it accessible to both beginners and experienced programmers. This ease of use allows data scientists to focus on solving complex problems rather than getting bogged down by intricate programming details.
 
Additionally, Python has a rich ecosystem of libraries and frameworks specifically designed for data analysis and machine learning. Libraries such as NumPy and pandas provide powerful tools for data manipulation and analysis, while frameworks like TensorFlow and scikit-learn offer robust solutions for building and deploying machine learning models. This extensive library support accelerates the development process and enhances productivity.
 
Furthermore, Python's versatility allows it to be used across various domains, from web development to scientific computing. This flexibility means that data scientists can integrate their work with other technologies and platforms seamlessly.
 
The strong community support surrounding Python is another significant advantage. With a vast number of tutorials, forums, and documentation available, practitioners can easily find resources to help them troubleshoot issues or learn new techniques. This collaborative environment fosters innovation and knowledge sharing, which is crucial in the rapidly evolving field of data science.
 
Lastly, Python's compatibility with big data technologies, such as Apache Spark and Hadoop, makes it an excellent choice for handling large datasets. As data continues to grow in volume and complexity, Python's ability to work with these technologies ensures that data scientists can efficiently process and analyze data at scale.

In summary, Python's simplicity, extensive libraries, versatility, community support, and compatibility with big data technologies make it a strong choice for data science and machine learning.


Let's review some basic topics: data types, type conversion, loops, and functions.

In [1]:
# Integer (int)
# Represents whole numbers, e.g., 5, -10.
age = 25
print(age)

# Try here: Uncomment the lines below to check whether age is greater than 18
# if age > 18:
#     print("You are an adult.")
# else:
#     print("You are a minor.")

25


In [2]:
# Float (float)
# Represents numbers with decimals, e.g., 3.14, -0.01.
temperature = 36.6
print(temperature)

# Try here: Uncomment the line below and change the value of temperature to see how it affects the output
# temperature = 37.5
# print(temperature)

36.6


In [3]:
# String (str)
name = "Alice"
print(name)

# Try here: Uncomment the line below and change the value of name to see how it affects the output
# name = "Bob"
# print(name)

Alice


In [4]:
# Boolean (bool)
is_sunny = True
print(is_sunny)

# Try here: Uncomment the line below and change the value of is_sunny to see how it affects the output
# is_sunny = False
# print(is_sunny)

True


In [5]:
# Integer to string
age = 25
print("I am " + str(age) + " years old.") 

# Try here: Uncomment the line below and change the value of age to see how it affects the output
# age = 30
# print("I am " + str(age) + " years old.")

I am 25 years old.


In [6]:
# String to integer
value = "10"
print(int(value) + 5)

# Try here: Uncomment the line below and change the value of 'value' to see how it affects the output
# value = "20"
# print(int(value) + 5)

15


In [7]:
# Float to integer
pi = 3.14
print(int(pi))

# Try here: Uncomment the line below and change the value of pi to see how it affects the output
# pi = 2.71
# print(int(pi))

3


In [8]:
# Integer to Boolean
number = 0
print(bool(number)) 

# Try here: Uncomment the line below and change the value of number to see how it affects the output
# number = 1
# print(bool(number))

False
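The conversion cells above use well-behaved inputs. The short sketch below, with example values chosen only for illustration, shows a few edge cases worth knowing: `int()` truncates toward zero rather than rounding, any non-empty string is truthy, and a string that is not a number raises `ValueError`.

```python
# int() drops the fractional part (truncates toward zero); it does not round
print(int(3.99))      # 3
print(int(-3.99))     # -3
print(round(3.99))    # 4

# bool() of a string depends only on whether the string is empty
print(bool(""))       # False
print(bool("False"))  # True, because the string is not empty

# Converting a non-numeric string raises ValueError
try:
    int("ten")
except ValueError as e:
    print(e)
```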



# Loops

In [9]:
# For Loops
numbers = [1, 2, 3, 4, 5]
for number in numbers:
    print(number)

# Try here: Uncomment the line below and change the values in the 'numbers' list to see how it affects the output
# numbers = [6, 7, 8, 9, 10]
# for number in numbers:
#     print(number)

1
2
3
4
5


In [10]:
# While loops
count = 5
while count > 0:
    print(count)
    count -= 1

# Try here: Uncomment the line below and change the value of 'count' to see how it affects the output
# count = 3
# while count > 0:
#     print(count)
#     count -= 1

5
4
3
2
1


In [11]:
# Example: Breaking out of a loop
for i in range(10):
    if i == 5:
        break
    print(i)

# Try here: Uncomment the line below and change the range to see how it affects the output
# for i in range(15):
#     if i == 10:
#         break
#     print(i)

0
1
2
3
4


In [12]:
# Example: Skipping an iteration
for i in range(5):
    if i == 2:
        continue
    print(i)

# Try here: Uncomment the line below and change the range to see how it affects the output
# for i in range(10):
#     if i == 4:
#         continue
#     print(i)

0
1
3
4


## Functions

In [13]:
# Example: Function that returns a value
def add(a, b):
    return a + b

# Try here: Uncomment the line below and change the values of 'a' and 'b' to see how it affects the output
# result = add(2, 3)
# print(result)

In [14]:
result = add(5, 3)
print(result)

# Try here: Uncomment the line below and change the values of 'a' and 'b' to see how it affects the output
# result = add(10, 7)
# print(result)

8


In [15]:
def greet(name="Guest"):
    print(f"Hello, {name}!")

# Try here: Uncomment the line below and change the name to see how it affects the output
# greet("YourName")


In [16]:
greet() 

Hello, Guest!


In [17]:
greet("Alice")

Hello, Alice!


# Let's try some code

### These commands install the libraries required for today


In [None]:
!pip install numpy pandas
!pip install matplotlib

NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and a wide range of mathematical functions to operate on these data structures. NumPy is particularly useful for numerical data processing and is often used as the foundation for other libraries.

Pandas is a powerful data manipulation and analysis library for Python. It provides data structures like Series and DataFrame, which allow for easy handling of structured data. Pandas is widely used for data analysis tasks, including data cleaning, transformation, and visualization, making it an essential tool for data scientists and analysts.
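As a small illustration of the two descriptions above, the sketch below applies arithmetic to a whole NumPy array at once and then wraps the result in a pandas Series. The temperature values and day labels are made up for this example.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies element-wise to the whole array, no loop needed
celsius = np.array([20.0, 25.0, 30.0])
fahrenheit = celsius * 9 / 5 + 32
print(fahrenheit)         # [68. 77. 86.]
print(fahrenheit.mean())  # 77.0

# pandas: a Series is a one-dimensional array with labels (an index)
temps = pd.Series(fahrenheit, index=["Mon", "Tue", "Wed"])
print(temps["Tue"])       # 77.0
```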


In [18]:
import numpy as np
import pandas as pd


### Create a small dataset using NumPy

Generate random scores for 5 students in 3 subjects (Math, Science, English).

In [19]:
np.random.seed(42)  # For reproducibility
num_students = 5
subjects = ['Math', 'Science', 'English']
scores = np.random.randint(50, 101, size=(num_students, len(subjects)))

### What is a DataFrame?
A DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure 
with labeled axes (rows and columns) in the Pandas library. It is similar to a spreadsheet or SQL table, 
or a dictionary of Series objects. DataFrames are widely used for data manipulation and analysis 
because they allow for easy handling of missing data, alignment of data, and various operations 
such as filtering, grouping, and aggregating. Each column in a DataFrame can be of a different data type, 
making it a versatile tool for data analysis in Python.
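The properties described above can be seen in a minimal sketch. The table below is hypothetical data (the name `df_demo` and its columns are not part of the lesson dataset): it mixes column types, contains one missing value, and is then filtered and grouped.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column has its own type, and one value is missing
df_demo = pd.DataFrame({
    "city": ["Kathmandu", "Pokhara", "Kathmandu", "Pokhara"],
    "temp_c": [22.5, np.nan, 24.0, 19.5],
    "rainy": [False, True, False, True],
})

# Each column keeps its own data type
print(df_demo.dtypes)

# Filtering: keep only the rows where it rained
rainy_days = df_demo[df_demo["rainy"]]
print(rainy_days)

# Grouping and aggregating: mean temperature per city
# (missing values are skipped by mean() by default)
mean_by_city = df_demo.groupby("city")["temp_c"].mean()
print(mean_by_city)
```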


In [20]:
# 2. Create a Pandas DataFrame from the Numpy array
df = pd.DataFrame(scores, columns=subjects)

In [21]:
# 3. Add additional columns
# Adding a column for student names (Student_1, Student_2, ...)
df['Student'] = [f'Student_{i+1}' for i in range(num_students)]
# Adding a column for the average score
df['Average'] = df[subjects].mean(axis=1)
# Adding a column to classify students as 'Pass' or 'Fail' based on their average score (>60 is Pass)
df['Status'] = np.where(df['Average'] > 60, 'Pass', 'Fail')

print("--- Dataset ---")
print(df)

print("\n--- Subject Averages ---")
print(df[subjects].mean())

print("\n--- Pass/Fail Counts ---")
print(df['Status'].value_counts())


--- Dataset ---
   Math  Science  English    Student    Average Status
0    88       78       64  Student_1  76.666667   Pass
1    92       57       70  Student_2  73.000000   Pass
2    88       68       72  Student_3  76.000000   Pass
3    60       60       73  Student_4  64.333333   Pass
4    85       89       73  Student_5  82.333333   Pass

--- Subject Averages ---
Math       82.6
Science    70.4
English    70.4
dtype: float64

--- Pass/Fail Counts ---
Status
Pass    5
Name: count, dtype: int64


**Instructor: Pawan Shah**  
Topic: Advanced Python  
Date: 22nd December, 2024  

Type hints, also known as type annotations, are a Python feature that lets you specify the expected data types of function arguments and return values. They improve code readability and maintainability by stating clearly what types of data a function expects. Python does not enforce type hints at runtime, but static type checkers, IDEs, and linters can use them to catch potential type-related errors during development.

In [1]:
#Hints
def add(a: int, b: int) -> int:
    return a + b

add(2, 3)

5
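To make the point about runtime behavior concrete, here is a small sketch that reuses the same `add` function. Passing strings is deliberately "wrong" according to the hints, yet Python runs the call without complaint; a static checker such as mypy would be expected to flag it before the code runs.

```python
def add(a: int, b: int) -> int:
    return a + b

# The hints say int, but Python does not check them when the function is called
print(add("2", "3"))  # 23 (string concatenation, no error raised)

# The annotations are stored on the function so that tools can inspect them
print(add.__annotations__)
```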

### What is a Docstring?

A docstring, short for documentation string, is a string literal that describes the purpose, usage, and behavior of a function, method, class, or module in Python. Unlike a comment, it is stored on the object as its `__doc__` attribute and can be read at runtime. It is conventionally enclosed in triple quotes (`"""` or `'''`) and must be the first statement in the body of the function, method, class, or module.

#### Uses of Docstrings:
1. **Documentation**: Docstrings provide a convenient way to document the code, making it easier for developers to understand the functionality and usage of the code.
2. **Help Function**: Python's built-in `help()` function can be used to display the docstring of a function, method, class, or module, providing quick access to the documentation.
3. **Code Readability**: Docstrings improve code readability by providing clear and concise descriptions of what the code does, its parameters, and its return values.
4. **API Documentation**: Docstrings can be used to generate API documentation automatically using tools like Sphinx, making it easier to maintain and update the documentation.

#### Advantages of Docstrings:
1. **Improved Code Understanding**: Docstrings help developers understand the code quickly without having to read through the entire implementation.
2. **Ease of Maintenance**: With docstrings, maintaining and updating documentation becomes easier, as the documentation is written alongside the code.
3. **Consistency**: Using docstrings ensures a consistent approach to documenting code, which is beneficial for large projects with multiple contributors.
4. **Enhanced Collaboration**: Well-documented code with docstrings facilitates better collaboration among team members, as everyone can easily understand the code's purpose and usage.

In [3]:
#Docstrings
def greet(name: str) -> str:
    """
    Greets the user by name.

    Args:
        name (str): The name of the user.

    Returns:
        str: A greeting message.
    """
    return f"Hello, {name}"
greet("Pawan")

'Hello, Pawan'
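The "Help Function" use listed earlier can be tried directly. The sketch below repeats the `greet` function so that it runs on its own, then reads the docstring in two ways: through the `__doc__` attribute and through the built-in `help()`.

```python
def greet(name: str) -> str:
    """
    Greets the user by name.

    Args:
        name (str): The name of the user.

    Returns:
        str: A greeting message.
    """
    return f"Hello, {name}"

# The docstring is stored on the function object itself
print(greet.__doc__)

# help() prints the signature together with the docstring
help(greet)
```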

### Modular Programming

Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules. Each module contains everything necessary to execute only one aspect of the desired functionality. This approach enhances code readability, maintainability, and reusability.

#### Advantages of Modular Programming:
1. **Improved Readability**: By breaking down a program into smaller, manageable modules, the code becomes easier to read and understand.
2. **Ease of Maintenance**: Modules can be developed, tested, and debugged independently, making it easier to maintain and update the code.
3. **Reusability**: Modules can be reused across different programs or projects, reducing redundancy and development time.
4. **Collaboration**: Modular programming allows multiple developers to work on different modules simultaneously, facilitating better collaboration and parallel development.


In [7]:
# Modular programming: the PID class lives in a separate module, mathOp (not shown in these notes)
from mathOp import PID

pid = PID(0.1, 0.01, 0.05)
pid.setpoint = 1.0
pid.update(0.5)


0.08
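The contents of `mathOp` are not included in these notes. The sketch below is one hypothetical version of such a module, assuming a textbook PID controller with a fixed time step of 1 and gains passed in the order (kp, ki, kd). Under those assumptions it reproduces a value of about 0.08 for the call above, but the real module may be written differently.

```python
# Hypothetical contents of mathOp.py; the real module is not shown in the notes
class PID:
    """A minimal PID controller with a fixed time step of 1 (assumed)."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp = kp  # proportional gain
        self.ki = ki  # integral gain
        self.kd = kd  # derivative gain
        self.setpoint = 0.0
        self._integral = 0.0
        self._previous_error = 0.0

    def update(self, measurement: float) -> float:
        """Return the control output for the latest measurement."""
        error = self.setpoint - measurement
        self._integral += error
        derivative = error - self._previous_error
        self._previous_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


pid = PID(0.1, 0.01, 0.05)
pid.setpoint = 1.0
output = pid.update(0.5)
print(round(output, 2))  # 0.08
```

In a real project the class would be saved in its own file and imported with `from mathOp import PID`, so that other programs can reuse it without copying the code.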

### Exception Handling

Exception handling is a mechanism in programming to handle runtime errors, ensuring the normal flow of the program is maintained. Python provides a way to handle exceptions using the `try`, `except`, `else`, and `finally` blocks.

#### Key Components:
1. **try block**: Code that might raise an exception is placed inside the `try` block.
2. **except block**: Code that handles the exception is placed inside the `except` block.
3. **else block**: Code that runs if no exception occurs is placed inside the `else` block.
4. **finally block**: Code that runs no matter what (whether an exception occurs or not) is placed inside the `finally` block.


### Custom Exceptions

Custom exceptions allow you to define your own error types, making your code more readable and providing more specific error messages. Custom exceptions are created by subclassing the built-in `Exception` class.



In [8]:
#Exceptions
try:
    val = 1/0
except ZeroDivisionError as e: 
    print(e)
else:
    print("No exception")
    print(val)
finally:
    print("Finally block")

division by zero
Finally block
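The cell above only exercises the failure path, so the `else` block never runs. The sketch below wraps the same pattern in a hypothetical helper, `safe_divide`, to show both paths side by side. Note that `finally` runs in both cases, even when the function returns from inside `else` or `except`.

```python
def safe_divide(a: float, b: float):
    """Return a / b, or None if b is zero (hypothetical helper)."""
    try:
        result = a / b
    except ZeroDivisionError as e:
        print(f"Error: {e}")
        return None
    else:
        # Runs only when the try block raised no exception
        print("No exception")
        return result
    finally:
        # Runs on both paths, before the function actually returns
        print("Finally block")

print(safe_divide(10, 2))
print(safe_divide(1, 0))
```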


In [9]:
#Custom Exceptions
class InvalidAgeError(Exception):
    """Exception raised for invalid age input."""
    def __init__(self, age, message="Age must be between 0 and 120."):
        self.age = age
        self.message = message
        super().__init__(self.message)

    def __str__(self):
        return f"{self.message} You entered: {self.age}"


In [10]:
def set_age(age):
    if age < 0 or age > 120:
        raise InvalidAgeError(age)
    print(f"Age set to: {age}")

try:
    set_age(-5)
except InvalidAgeError as e:
    print(e)

Age must be between 0 and 120. You entered: -5
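One benefit of a custom exception is that callers can handle it separately from other errors and read the extra data it carries. The sketch below repeats `InvalidAgeError` so that it runs on its own and adds a hypothetical helper, `parse_age`, that can fail in two different ways.

```python
class InvalidAgeError(Exception):
    """Exception raised for invalid age input."""
    def __init__(self, age, message="Age must be between 0 and 120."):
        self.age = age
        self.message = message
        super().__init__(self.message)


def parse_age(text: str) -> int:
    """Convert text to an age (hypothetical helper for this example)."""
    age = int(text)  # raises ValueError for non-numeric text
    if age < 0 or age > 120:
        raise InvalidAgeError(age)
    return age


for raw in ["42", "150", "abc"]:
    try:
        print(parse_age(raw))
    except InvalidAgeError as e:
        # The handler can use the attributes stored on the exception
        print(f"Invalid age {e.age}: {e.message}")
    except ValueError:
        print(f"Not a number: {raw!r}")
```

Catching the specific exception type, rather than the general `Exception`, keeps unrelated errors from being silently handled by the same branch.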
