# Python

## Python Theory Notes

---

### **Week 1**

1. **What is Python?**  
   High-level, interpreted, dynamically typed, general-purpose programming language.

2. **Benefits of Python**  
   - High-level data structures  
   - Easy to read/write  
   - Large community & library support  
   - Open-source  
   - Great for rapid development  
   - Easily integrates with other languages

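3. **Dynamically Typed**  
   Types belong to objects rather than variable names and are checked at runtime, so a name can be rebound to a value of another type.  
   - Python is also **strongly typed** (not weakly typed): it never implicitly converts between unrelated types, so `"2" + 2` raises `TypeError`.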

4. **PEP 8**  
   Python Enhancement Proposal for style guide and coding conventions.

5. **Common Built-in Data Types**  
   - `NoneType` → `None`  
   - Numeric → `int`, `float`, `complex`  
   - Sequence → `str`, `list`, `tuple`, `range`  
   - Mapping → `dict`  
   - Set → `set`, `frozenset`  
   - Callable → functions (including `lambda` expressions), methods, classes

6. **Operator Precedence**  
   Defines order in which operations are evaluated:  
   `() > ** > * / // % > + - > comparison > not > and > or`

7. **Ternary Operator**  
   `x if condition else y`

8. **Identity Operators**  
   - `is`, `is not`: Compare memory location (identity), not values.  
     Example: `a is b`

9. **Disadvantages of Python**  
   - Slower than compiled languages  
   - Not great for mobile apps  
   - High memory usage  
   - GIL (Global Interpreter Lock) limits threading

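10. **How are strings, tuples, and lists stored?**  
    - **Strings & Tuples**: Immutable; any "change" creates a new object. CPython may reuse (intern) some strings, such as identifier-like literals, but that is an implementation detail.  
    - **Lists**: Mutable; stored as a dynamic array of references to objects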

11. **Zen of Python**  
    `import this`  
    - Set of guiding principles by Tim Peters (e.g., "Simple is better than complex")

12. **`is` vs `==`**  
    - `is`: Identity check (same memory)  
    - `==`: Value check

13. **`_` Variable**  
    - Holds result of last expression in interactive shell  
    - Used as throwaway variable: `_ = something`

14. **Module, Package, Library**  
    - **Module**: Single `.py` file  
    - **Package**: Folder with `__init__.py` containing modules  
    - **Library**: Collection of modules/packages with specific purpose

15. **Docstring**  
    First string in a function/module/class. Used for documentation.  
    Access via `help()` or `.__doc__`
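
The difference between `is` and `==`, and the strong-typing rule from item 3, can be checked in a few lines. This is a minimal sketch; the variable names are illustrative and not part of the notes.

```python
# Two names bound to one list, plus an independent list with equal contents.
a = [1, 2, 3]
b = a          # alias: same object
c = list(a)    # new object holding the same values

equal_values = a == c      # compares contents
same_object = a is c       # compares identity
alias_is_same = a is b

# Strong typing: str + int is rejected instead of being silently converted.
try:
    "2" + 2
    mixed_add_error = None
except TypeError as exc:
    mixed_add_error = type(exc).__name__

print(equal_values, same_object, alias_is_same, mixed_add_error)
# True False True TypeError
```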

---

### **Week 2**

16. **Aliasing**  
    Two variables refer to the same object:  
    ```python
    a = [1, 2]; b = a
    ```

17. **Garbage Collection**  
    - Python uses reference counting + cyclic GC  
    - `gc` module for manual control

18. **Mutability**  
    - **Mutable**: Can change (list, dict, set)  
    - **Immutable**: Cannot change (int, float, str, tuple)

19. **Cloning / Copying**  
    - **Shallow Copy**: `copy.copy()` → Top-level only  
    - **Deep Copy**: `copy.deepcopy()` → Recursively copies all levels  
    - Also: `new_list = old_list[:]` (shallow for lists)
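
Aliasing, shallow copy, and deep copy behave differently once a nested object is mutated. The sketch below uses a small nested list as an assumed example.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                  # no copy at all
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent structure

original[0].append(99)            # mutate a nested list
original.append([5, 6])           # mutate the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]  follows every change
print(shallow)  # [[1, 2, 99], [3, 4]]          sees the nested change only
print(deep)     # [[1, 2], [3, 4]]              unchanged
```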

### **Week 3**

1. **Set Indexing**
  - **Sets are unordered**, so they don't support indexing or slicing like lists.
  - Internally use **hash tables** — fast membership testing (`in`), similar to `dict`.
  - Useful when **search speed** matters over order.


2. **Why Dict Keys Can't Be Mutable**
  - Keys must be **hashable** (i.e., implement `__hash__()` and `__eq__()`).
  - Mutable types (e.g., list, dict) can change → hash value changes → breaks key lookup.



3. **`enumerate()` & `sorted()`**
  - `enumerate(iterable)` → Adds index to iterable.
  ```python
  for i, val in enumerate(['a', 'b']): ...
  ```
  - `sorted(iterable, key=...)` → Custom sort logic.
  ```python
  sorted(data, key=lambda x: x.age)
  ```


4. **Destructuring**
  - Unpack iterable elements into variables.
  ```python
  a, b = [1, 2]
  name, *rest = ["Alice", 25, "India"]
  ```


5. **`dir()` / `isinstance()` / `issubclass()`**
  - `dir(obj)` → Lists attributes/methods.
  - `isinstance(obj, Class)` → Checks instance.
  - `issubclass(Sub, Base)` → Checks inheritance.


6. **`@classmethod` vs `@staticmethod`**
  - `@classmethod` → Receives the class as its first argument (`cls`); often used for alternative constructors.
  - `@staticmethod` → Receives neither `self` nor `cls`; a utility function namespaced inside a class.


7. **Diamond Problem**
  - Occurs in **multiple inheritance** when a class inherits from two classes that share a common base.
  - Python uses **MRO (Method Resolution Order)** to resolve this.


8. **`_` / `__` in Variables & Methods**
  - `_var`: Convention for "internal use".
  - `__var`: Name mangling to avoid name clashes.
  - `__method__`: Dunder methods like `__init__`, `__len__`.


9. **`__repr__` vs `__str__`**
  - `__str__`: User-friendly string (used by `print()`).
  - `__repr__`: Developer-friendly, used in debugging and console output.
  ```python
  print(str(obj))   # uses __str__ (falls back to __repr__ if __str__ is not defined)
  print(repr(obj))  # uses __repr__
  ```


10. **Objects in Sets**
  - Sets require **hashable** items, not necessarily immutable.
  - Custom objects must implement `__hash__` and `__eq__`.


11. **`__name__ == "__main__"`**
  - Checks if script is run directly or imported as module.
  ```python
  if __name__ == "__main__": main()
  ```
  - Useful for testing or entry-point logic.


12. **Python Packages (`__init__.py`)**
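  - A **package** is a directory of modules, conventionally marked by an `__init__.py` file (can be empty; since Python 3.3 a directory without one is treated as a namespace package).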
  - `__init__.py` can control what gets imported.


13. **Third-Party Packages**
  - Installed via tools like `pip`.
  ```bash
  pip install requests
  ```
  - Usually found on [PyPI](https://pypi.org/).
  - Import like built-in modules after installation.
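
The diamond problem and MRO from item 7 can be observed directly. In this sketch the class names `Base`, `Left`, `Right`, and `Child` are hypothetical; `super()` follows the MRO of the instance's class, so each class in the diamond runs exactly once.

```python
class Base:
    def who(self):
        return "Base"

class Left(Base):
    def who(self):
        return "Left -> " + super().who()

class Right(Base):
    def who(self):
        return "Right -> " + super().who()

class Child(Left, Right):   # diamond: both parents share Base
    pass

mro_names = [cls.__name__ for cls in Child.__mro__]
call_chain = Child().who()

print(mro_names)   # ['Child', 'Left', 'Right', 'Base', 'object']
print(call_chain)  # Left -> Right -> Base
```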



## Data types

### **String (str)**  

- **Immutable**, **ordered**, **indexed**, **iterable**, supports **slicing**.  
- Enclosed in `' '`, `" "`, `''' '''`, or `""" """`.

---

#### **Basic Functions**  
```python
len(s)       # Length  
max(s)       # Max char (by Unicode)  
min(s)       # Min char  
sorted(s)    # Returns sorted list of characters
```

---

#### **String Methods**  

**Case Handling:**  
```python
s.capitalize()     # First char uppercase  
s.title()          # Each word capitalized  
s.upper()          # All uppercase  
s.lower()          # All lowercase  
s.swapcase()       # Swap case
```

**Searching & Counting:**  
```python
s.count(x)         # Count occurrences  
s.find(x)          # First index, -1 if not found  
s.index(x)         # Like find, but error if not found  
s.startswith(x)    # True if starts with x  
s.endswith(x)      # True if ends with x
```

**Validation:**  
```python
s.isalnum()        # Letters & digits  
s.isalpha()        # Only letters  
s.isdigit()        # Only digits  
s.isidentifier()   # Valid identifier
```

**Modification:**  
```python
s.split(sep)       # Split into list  
s.join(items)      # Join the strings in items, using s as the separator  
s.replace(a, b)    # Replace a with b  
s.strip()          # Remove leading/trailing whitespace
```

**Formatting:**  
```python
"{} is {}".format(x, y)  
f"{x} is {y}"      # f-string
```

### **List (list)**

- **Mutable**, **ordered**, **indexed**, **iterable**, supports **slicing**.  
- Can contain mixed data types. Defined with `[ ]`.

---

#### **Basic Functions**  
```python
len(lst)         # Number of items  
min(lst)         # Min value  
max(lst)         # Max value  
sorted(lst)      # Returns new sorted list
```

---

#### **List Methods**  

**Adding Elements:**  
```python
lst.append(x)      # Add at end  
lst.extend(iter)   # Add all from iterable  
lst.insert(i, x)   # Insert at index
```

**Removing Elements:**  
```python
lst.remove(x)      # Remove first match  
lst.pop(i)         # Remove & return item at i (default last)  
lst.clear()        # Remove all items
```

**Info & Utilities:**  
```python
lst.count(x)       # Count occurrences  
lst.index(x)       # Index of first match  
lst.reverse()      # In-place reverse  
lst.sort()         # In-place sort (use key= or reverse=)  
sorted(lst)        # Returns new sorted list
```

**Copying:**  
```python
lst2 = lst.copy()      # Shallow copy  
lst2 = lst[:]          # Another shallow copy way
```

**Zipping:**  
```python
zip(lst1, lst2)        # Pair elements from multiple lists  
list(zip(lst1, lst2))  # Convert to list of tuples
```

---

#### **Other Notes**  
- `in` keyword checks for membership: `x in lst`  
- `for item in lst:` to iterate  
- Use list comprehension: `[x*2 for x in lst if x > 0]`  
- Nested lists supported: `lst = [[1, 2], [3, 4]]`
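
A short sketch tying several of the list tools above together. The `names` and `scores` data are made up for illustration.

```python
scores = [72, 95, 60, 88]
names = ["Asha", "Ben", "Chen", "Dee"]

ranked = sorted(scores, reverse=True)    # new list; scores is untouched
result_of_sort = scores.copy().sort()    # in-place sort returns None

pairs = list(zip(names, scores))         # [(name, score), ...]
passed = [name for name, score in pairs if score >= 70]

print(ranked)          # [95, 88, 72, 60]
print(result_of_sort)  # None
print(passed)          # ['Asha', 'Ben', 'Dee']
```

Assigning the result of `lst.sort()` is a common slip: the list is sorted in place and the call itself returns `None`.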



### **Dictionary (dict)**  

- **Mutable**, **insertion-ordered** (guaranteed since Python 3.7), stores **key-value** pairs.  
- Keys must be **unique** and **hashable** (e.g., str, int, or a tuple of hashable items).

---

#### **Basic Operations**  
```python
len(d)           # Number of key-value pairs  
sorted(d)        # Sorted list of keys
```

---

#### **Access / Retrieval**  
```python
d[key]           # Get value (KeyError if not found)  
d.get(key)       # Get value or None if not found  
d.items()        # All (key, value) pairs  
d.keys()         # All keys  
d.values()       # All values
```

---

#### **Modification**  
```python
d.update({k: v})     # Add or update entries  
d[key] = value       # Add/update a single key
```

---

#### **Deletion**  
```python
d.pop(key)           # Remove by key (KeyError if not found)  
d.popitem()          # Remove last inserted item  
del d[key]           # Delete specific key  
d.clear()            # Remove all items
```

---

#### **Other Notes**  
- `in` keyword checks key: `if key in d:`  
- Dictionaries can be nested: `d = {"a": {"b": 1}}`  
- Dictionary comprehension: `{k: k**2 for k in range(5)}`  
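
The sketch below exercises the access, update, and deletion calls listed above on an assumed `inventory` dict, and shows why a list cannot be used as a key.

```python
inventory = {"apples": 3, "pears": 0}

missing_default = inventory.get("kiwis", 0)   # fallback value instead of KeyError
inventory["kiwis"] = 5                        # new keys are added at the end
inventory.update({"apples": 4})               # updating keeps the original position

key_order = list(inventory)                   # ['apples', 'pears', 'kiwis']
removed = inventory.pop("pears")              # returns the removed value
last_item = inventory.popitem()               # returns the last (key, value) pair

try:
    {["a", "b"]: 1}                           # a list is unhashable
    unhashable_error = None
except TypeError as exc:
    unhashable_error = type(exc).__name__

print(key_order, removed, last_item, unhashable_error)
```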


### **Tuple (tuple)**  

- **Immutable**, **ordered**, **indexed**, supports **slicing**.  
- Defined with `( )`, single item must have comma: `(1,)`.

---

#### **Basic Functions**  
```python
len(t)           # Number of elements  
sum(t)           # Sum (if numeric)  
min(t), max(t)   # Min/Max value  
sorted(t)        # Returns a sorted list (not a tuple)
```

---

#### **Tuple Methods**  
```python
t.count(x)       # Count occurrences  
t.index(x)       # Index of first match
```

---

#### **Zipping & Unpacking**  
```python
zip(t1, t2)                  # Combine tuples into pairs  
list(zip(t1, t2))            # Convert zipped object to list

a, b = (1, 2)                # Unpacking  
a, *b, c = (1, 2, 3, 4)      # Extended unpacking
```

---

#### **Other Notes**  
- Tuples can be nested: `t = ((1, 2), (3, 4))`  
- Use tuple when data shouldn’t change (e.g., coordinates, fixed config)

### **Set (set)**  

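- **Unordered**, **mutable**, **no duplicates**, **iterable**.  
- Defined with `{1, 2, 3}` or `set()`; note that `{}` creates an empty **dict**, so use `set()` for an empty set. Items must be hashable, so a set cannot contain lists or dicts.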

---

#### **Basic Functions**  
```python
len(s)           # Number of elements  
sum(s), min(s), max(s)  
sorted(s)        # Returns a sorted list
```

---

#### **Adding & Removing Elements**  
```python
s.add(x)               # Add one item  
s.update([x, y])       # Add multiple items  
s.discard(x)           # Remove if exists (no error)  
s.remove(x)            # Remove or raise KeyError  
s.pop()                # Remove & return an arbitrary item (KeyError if empty)  
s.clear()              # Remove all items
```

---

#### **Set Operations**  
```python
s.union(t)                   # s ∪ t  
s.update(t)                  # s |= t (in-place union)

s.intersection(t)            # s ∩ t  
s.intersection_update(t)     # s &= t

s.difference(t)              # s - t  
s.difference_update(t)       # s -= t

s.symmetric_difference(t)          # s ⊕ t  
s.symmetric_difference_update(t)   # s ^= t
```

---

#### **Comparisons**  
```python
s.isdisjoint(t)        # True if no common items  
s.issubset(t)          # True if s ⊆ t  
s.issuperset(t)        # True if s ⊇ t
```

---

#### **Copy & Frozen Set**  
```python
s2 = s.copy()          # Shallow copy  
fs = frozenset(s)      # Immutable set
```

---

#### **Looping**  
```python
for item in s:         # Iteration  
    print(item)
```
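
The method calls above also have operator forms. This sketch uses two made-up skill sets to show them, along with the empty-set pitfall.

```python
empty_dict = {}        # braces alone create a dict
empty_set = set()      # the only way to write an empty set

backend = {"python", "go", "sql"}
data = {"python", "sql", "r"}

both = backend & data            # same as backend.intersection(data)
either = backend | data          # same as backend.union(data)
only_backend = backend - data    # same as backend.difference(data)
exactly_one = backend ^ data     # same as backend.symmetric_difference(data)

unique_sorted = sorted(set([3, 1, 3, 2, 1]))   # de-duplicate, then order

print(type(empty_dict).__name__, type(empty_set).__name__)  # dict set
print(unique_sorted)                                        # [1, 2, 3]
```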



## OOP

---

### **Basics**

- **Class**: Blueprint for creating objects; defines **data (attributes)** and **functions (methods)**.
- **Object**: Instance of a class; created using `obj = ClassName()`.

---

### **Special Concepts**

- **Constructor**:  
  `__init__()` → special method auto-called on object creation.

- **Method vs Function**:  
  - **Method**: Defined inside class, takes `self`  
  - **Function**: Independent, no `self`

- **Magic/Dunder Methods**:  
  Methods with `__name__` (e.g., `__init__`, `__str__`, `__len__`, `__add__`)

- **`self`**: Refers to the current instance; used to access attributes/methods inside class.

- **Reference Variable**:  
  `obj1 = obj2` → both refer to the same object.

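- **Pass by Object Reference**:  
  Arguments are passed as references to objects and are never copied ("call by sharing"). Mutating a mutable argument inside a function affects the caller's object; rebinding the parameter name does not.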

- **Mutability**:  
  - **Mutable objects** can be changed (e.g., list)  
  - **Immutable objects** (e.g., int, str) cannot
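
How argument passing interacts with mutability can be sketched with three small functions. The function and variable names here are hypothetical.

```python
def append_item(items):
    items.append("new")      # mutates the caller's list in place

def rebind(items):
    items = ["replaced"]     # rebinds the local name only
    return items

def increment(n):
    n += 1                   # int is immutable: a new object is bound locally
    return n

basket = ["old"]
append_item(basket)
rebind(basket)

count = 10
increment(count)

print(basket)  # ['old', 'new']
print(count)   # 10
```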

---

### **Encapsulation**

- **Private Members**: Prefix with `__`; Python name-mangles the attribute to `_ClassName__var`, which discourages but does not prevent outside access  
- **Getter/Setter**: Access private data via methods  
  ```python
  def get_x(self): return self.__x  
  def set_x(self, val): self.__x = val
  ```

- **Objects in collections**: Instances can be stored in lists, dicts, and sets  
  like any other value (as dict keys or set members they must be hashable).
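
Name mangling can be seen on a live object. In this sketch the `Account` class is hypothetical, and `@property` is used as a common alternative to explicit `get_x`/`set_x` methods rather than as the approach taken in these notes.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance          # stored as _Account__balance

    @property
    def balance(self):                    # getter
        return self.__balance

    @balance.setter
    def balance(self, value):             # setter with validation
        if value < 0:
            raise ValueError("balance cannot be negative")
        self.__balance = value

acct = Account(100)
acct.balance = 150                        # goes through the setter

direct_access_fails = not hasattr(acct, "__balance")   # plain name does not exist
mangled_value = acct._Account__balance                 # mangled name still works

print(acct.balance, direct_access_fails, mangled_value)  # 150 True 150
```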

---

### **Static vs Instance**

- **Instance Variable/Method**:  
  Belongs to each object, accessed via `self`.

- **Static Method**:  
  Decorated with `@staticmethod`, no `self`, belongs to class.

- **Class Method**:  
  Decorated with `@classmethod`, takes `cls` instead of `self`.
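
The three kinds of methods can sit side by side in one class. The `Temperature` class below is an assumed example, not something defined elsewhere in these notes.

```python
class Temperature:
    scale = "Celsius"                      # class variable, shared by all instances

    def __init__(self, degrees):
        self.degrees = degrees             # instance variable

    def describe(self):                    # instance method: needs self
        return f"{self.degrees} {self.scale}"

    @classmethod
    def from_fahrenheit(cls, f):           # alternative constructor: needs cls
        return cls(round((f - 32) * 5 / 9, 1))

    @staticmethod
    def is_freezing(degrees):              # utility: needs neither self nor cls
        return degrees <= 0

t = Temperature.from_fahrenheit(212)
print(t.describe())                 # 100.0 Celsius
print(Temperature.is_freezing(-5))  # True
```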

---

### **Aggregation**

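- A class contains an object of another class (has-a).  
- Private (`__`) attributes of the contained object are name-mangled, so the containing class **cannot access them** by their plain name and should go through the contained object's public methods.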

---

### **Inheritance**

- **What gets inherited?**  
  - Constructor if not overridden  
  - Non-private attributes & methods  
  - Use `super().__init__()` to call parent constructor

- **`super()`**  
  Built-in used **inside a class** to call the parent's methods or constructor.  
  Instance attributes live on `self`, so read them via `self.attr` rather than through `super()`.

- **Types of Inheritance**  
  - Single  
  - Multiple  
  - Multilevel  
  - Hierarchical  
  - Hybrid

---

### **Polymorphism**

- **Method Overriding**: Redefine parent method in child.  
- **Method Overloading**: Not supported natively; use default args.  
- **Operator Overloading**:  
  Define magic methods like `__add__`, `__lt__`, `__str__`.
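
Operator overloading can be sketched with a small 2D vector. The `Vector` class is hypothetical, and ordering by squared length in `__lt__` is just one possible choice.

```python
class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):             # v1 + v2
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):              # v1 == v2
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __lt__(self, other):              # v1 < v2, compared by squared length
        return self.x**2 + self.y**2 < other.x**2 + other.y**2

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

total = Vector(1, 2) + Vector(3, 4)
shortest = min(Vector(3, 4), Vector(1, 1))   # min() relies on __lt__

print(total)     # Vector(4, 6)
print(shortest)  # Vector(1, 1)
```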

---

### **Abstraction**

- Hides internal details, shows only essential features.  
- Done using **abstract classes** (`abc` module) and **interfaces**.  
  ```python
  from abc import ABC, abstractmethod
  class A(ABC):
      @abstractmethod
      def foo(self): pass
  ```
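
An abstract class cannot be instantiated until every abstract method is overridden. The `Shape` and `Square` classes below are assumed examples.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""

    def describe(self):                 # concrete method shared by subclasses
        return f"{type(self).__name__} with area {self.area()}"

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):                     # fulfils the abstract contract
        return self.side ** 2

try:
    Shape()                             # abstract method not implemented
    instantiation_error = None
except TypeError as exc:
    instantiation_error = type(exc).__name__

summary = Square(3).describe()
print(instantiation_error)  # TypeError
print(summary)              # Square with area 9
```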



## File handling & Serialization + Deserialization

---

### **Basic File Modes**

| Mode | Meaning               |
|------|------------------------|
| `'r'` | Read (default)         |
| `'w'` | Write (overwrite)      |
| `'a'` | Append                 |
| `'x'` | Create (error if exists) |
| `'b'` | Binary mode            |
| `'t'` | Text mode (default)    |
| `'+'` | Read and Write         |

---

### **Opening & Reading Files**

```python
f = open('file.txt', 'r')
content = f.read()           # Read full content
f.close()
```

```python
with open('file.txt', 'r') as f:
    lines = f.readlines()   # List of all lines
```

```python
with open('file.txt') as f:
    for line in f:
        print(line.strip()) # Line-by-line read
```

---

### **Writing to Files**

```python
with open('file.txt', 'w') as f:
    f.write("Hello\nWorld")  # Overwrites file
```

```python
with open('file.txt', 'a') as f:
    f.write("\nAppended line")  # Adds to file
```

---

### **Read + Write**

```python
with open('file.txt', 'r+') as f:
    data = f.read()
    f.seek(0)
    f.write("Updated\n")   # Overwrites the start of the file; any remaining old content is left in place
```

---

### **Writing List to File**

```python
lines = ["Line1\n", "Line2\n"]
with open('file.txt', 'w') as f:
    f.writelines(lines)
```

---

### **File Check (Optional)**

```python
import os
os.path.exists('file.txt')       # Check if file exists  
os.remove('file.txt')            # Delete file  
os.rename('old.txt', 'new.txt')  # Rename file  
```

---

### **Working with Binary Files**

```python
with open('image.jpg', 'rb') as f:
    data = f.read()

with open('copy.jpg', 'wb') as f:
    f.write(data)
```

---

### **Common Utility**

```python
with open('file.txt') as f:
    first_line = f.readline()
```

```python
with open('file.txt') as f:
    word_count = sum(len(line.split()) for line in f)
```
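
The modes from the table can be tried safely in a temporary directory. This is a sketch under the assumption that a throwaway file is acceptable; the file name `notes.txt` and the explicit `encoding="utf-8"` argument are illustrative choices.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")

    with open(path, "w", encoding="utf-8") as f:   # create / overwrite
        f.write("first line\n")

    with open(path, "a", encoding="utf-8") as f:   # append
        f.write("second line\n")

    with open(path, encoding="utf-8") as f:        # read line by line
        lines = [line.rstrip("\n") for line in f]

    try:
        open(path, "x")                            # 'x' refuses to overwrite
        exclusive_error = None
    except FileExistsError as exc:
        exclusive_error = type(exc).__name__

    existed_inside = os.path.exists(path)

exists_after = os.path.exists(path)                # directory is cleaned up on exit

print(lines)            # ['first line', 'second line']
print(exclusive_error)  # FileExistsError
```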

---

### Serialization & Deserialization in Python

#### 🔄 What is Serialization?

Serialization is the process of converting a Python object into a format that can be easily saved to a file or transmitted over a network.

#### 🔁 What is Deserialization?

Deserialization is the reverse process — converting the serialized data back into a Python object.

---

### 📦 Using `json` Module (For Text-Based Serialization)

The `json` module allows for serialization and deserialization of basic Python data types (dict, list, int, float, str, bool, None).

#### ✅ JSON Serialization - `json.dump()` / `json.dumps()`

```python
import json

data = {"name": "Alice", "age": 30}
json_string = json.dumps(data)  # Converts to JSON string
with open("data.json", "w") as f:
    json.dump(data, f)  # Saves JSON to file
```

#### ✅ JSON Deserialization - `json.load()` / `json.loads()`

```python
with open("data.json", "r") as f:
    loaded_data = json.load(f)  # Reads JSON from file

json_string = '{"name": "Alice", "age": 30}'
data = json.loads(json_string)  # Converts JSON string to Python dict
```

#### ⚠️ Custom Objects are Not JSON Serializable

```python
class Person:
    def __init__(self, name):
        self.name = name

person = Person("Alice")
json.dumps(person)  # ❌ Raises TypeError
```

#### 🔧 To Serialize Custom Objects with JSON

Use a custom encoder or convert object to dict manually:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def to_dict(self):
        return {"name": self.name}

person = Person("Alice")
json_string = json.dumps(person.to_dict())  # ✅ Works
```
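
The "custom encoder" route can be sketched with the `default=` hook of `json.dumps` and the `object_hook=` hook of `json.loads`. The `"__type__"` marker key and the helper names `encode_person` and `decode_person` are assumptions made for this example, not a JSON standard.

```python
import json

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def encode_person(obj):
    """Called by json.dumps for any object it cannot serialize itself."""
    if isinstance(obj, Person):
        return {"__type__": "Person", "name": obj.name, "age": obj.age}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

def decode_person(d):
    """Called by json.loads for every decoded JSON object (dict)."""
    if d.get("__type__") == "Person":
        return Person(d["name"], d["age"])
    return d

text = json.dumps({"owner": Person("Alice", 30)}, default=encode_person)
restored = json.loads(text, object_hook=decode_person)

print(text)
print(type(restored["owner"]).__name__, restored["owner"].name)  # Person Alice
```

Compared with calling `to_dict()` by hand, the hooks also handle objects nested inside lists and dicts.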

---

### 🥒 Using `pickle` Module (For Binary Serialization)

The `pickle` module supports serialization of **most Python objects**, including instances of custom classes (but not, for example, lambdas or open file handles).

#### ✅ Pickling (Serialization)

```python
import pickle

data = {"name": "Alice", "age": 30}
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
```

#### ✅ Unpickling (Deserialization)

```python
with open("data.pkl", "rb") as f:
    loaded_data = pickle.load(f)
```

#### ⚠️ Pickle Considerations

- **Binary Format**: Not human-readable.
- **Security Warning**: **Never unpickle data from untrusted sources!**
- **Version Sensitive**: May not be portable across Python versions or platforms.
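
The practical difference between the two modules shows up with types that JSON has no equivalent for. A minimal sketch with an assumed `record` dict containing a tuple and a set:

```python
import json
import pickle

record = {"point": (1, 2), "tags": {"a", "b"}}

blob = pickle.dumps(record)       # bytes, not human-readable text
restored = pickle.loads(blob)     # safe here only because we produced blob ourselves

try:
    json.dumps(record)            # sets are not JSON serializable
    json_error = None
except TypeError as exc:
    json_error = type(exc).__name__

# JSON has no tuple type, so a tuple comes back as a list.
json_point = json.loads(json.dumps({"point": (1, 2)}))["point"]

print(restored == record)   # True
print(json_error)           # TypeError
print(json_point)           # [1, 2]
```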

---

### ✅ When to Use What?

| Format  | Module   | Use Case                           | Human-Readable | Supports Custom Objects |
|---------|----------|-------------------------------------|----------------|--------------------------|
| JSON    | `json`   | APIs, config files, data exchange   | ✅ Yes          | ❌ No (without conversion) |
| Pickle  | `pickle` | Storing Python objects, ML models   | ❌ No           | ✅ Yes                    |

---

### 🔚 Summary

- Use **JSON** for interoperability and readability.
- Use **Pickle** for internal use with complex Python objects.
- Avoid unpickling untrusted data due to security risks.


## Exception Handling

### What is Exception Handling?

Exception handling allows a program to respond to runtime errors gracefully, instead of crashing.

---

### Basic Syntax

```python
try:
    # Code that might raise an exception
    x = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero.")
```

---

### Common Clauses

- **`try`**: Code block to monitor.
- **`except`**: Block that handles the exception.
- **`else`**: Executes if no exception occurs.
- **`finally`**: Executes no matter what (useful for cleanup).

```python
try:
    x = int("10")
except ValueError:
    print("Invalid input.")
else:
    print("Conversion successful.")
finally:
    print("This always runs.")
```
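
The order in which the four clauses run can be recorded explicitly. In this sketch the helper `parse_age` is hypothetical and simply logs each clause it passes through.

```python
def parse_age(text):
    steps = []
    try:
        steps.append("try")
        age = int(text)           # may raise ValueError
    except ValueError:
        steps.append("except")
        age = None
    else:
        steps.append("else")      # only when no exception occurred
    finally:
        steps.append("finally")   # always runs
    return age, steps

ok = parse_age("42")
bad = parse_age("forty-two")

print(ok)   # (42, ['try', 'else', 'finally'])
print(bad)  # (None, ['try', 'except', 'finally'])
```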

---

### Catching Multiple Exceptions

```python
try:
    value = int("abc")   # raises ValueError; int(None) would raise TypeError
except (TypeError, ValueError) as e:
    print(f"Error: {e}")
```

---

### Raising Exceptions

```python
raise ValueError("Invalid value provided.")
```

---

### Custom Exceptions

```python
class MyError(Exception):
    pass

raise MyError("Something went wrong.")
```
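
A custom exception becomes more useful when it carries data the handler can act on. The `InsufficientFundsError` class and `withdraw` function below are assumed examples.

```python
class InsufficientFundsError(Exception):
    """Raised when a withdrawal exceeds the available balance."""

    def __init__(self, balance, amount):
        super().__init__(f"cannot withdraw {amount}; balance is {balance}")
        self.balance = balance
        self.amount = amount

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount

try:
    withdraw(50, 80)
except InsufficientFundsError as exc:
    message = str(exc)
    shortfall = exc.amount - exc.balance   # handler uses the attached data

print(message)    # cannot withdraw 80; balance is 50
print(shortfall)  # 30
```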

---

### Summary

- Handle only specific exceptions you expect.
- Use `finally` for resource cleanup (e.g., file closing).
- Avoid bare `except:` — it hides real bugs.


### Common Python Exceptions

| Exception            | Description |
|----------------------|-------------|
| `SyntaxError`        | Raised when Python code is invalid or incorrectly written. |
| `NameError`          | Raised when a variable is not defined. |
| `TypeError`          | Raised when an operation is applied to an object of inappropriate type. |
| `ValueError`         | Raised when a function gets a correct type but inappropriate value. |
| `IndexError`         | Raised when a list index is out of range. |
| `KeyError`           | Raised when a key is not found in a dictionary. |
| `AttributeError`     | Raised when an attribute reference is invalid. |
| `ZeroDivisionError`  | Raised when dividing a number by zero. |
| `FileNotFoundError`  | Raised when a file or directory is not found. |
| `ImportError`        | Raised when an import statement fails. |
| `IndentationError`   | Raised for incorrect indentation. |
| `RuntimeError`       | Raised when no other category fits but something went wrong during execution. |
| `StopIteration`      | Raised by `next()` when the iterator has no more items. |
| `AssertionError`     | Raised when an `assert` statement fails. |
| `MemoryError`        | Raised when an operation runs out of memory. |
| `EOFError`           | Raised when `input()` hits end-of-file condition (no more data). |

> Note: All exceptions inherit from `BaseException`; `Exception` is the base class for ordinary errors and should be the parent of user-defined exceptions.


# FastAPI

## Pydantic

In [None]:
from pydantic import BaseModel, EmailStr, AnyUrl, Field
from typing import List, Dict, Optional, Annotated

class Patient(BaseModel):

    name: Annotated[str, Field(max_length=50, title='Name of the patient', description='Give the name of the patient in less than 50 chars', examples=['Nitish', 'Amit'])]
    email: EmailStr
    linkedin_url: AnyUrl
    age: int = Field(gt=0, lt=120)
    weight: Annotated[float, Field(gt=0, strict=True)]
    married: Annotated[Optional[bool], Field(default=None, description='Is the patient married or not')]
    allergies: Annotated[Optional[List[str]], Field(default=None, max_length=5)]
    contact_details: Dict[str, str]


def update_patient_data(patient: Patient):

    print(patient.name)
    print(patient.age)
    print(patient.allergies)
    print(patient.married)
    print('updated')

patient_info = {'name':'nitish', 'email':'abc@gmail.com', 'linkedin_url':'http://linkedin.com/1322', 'age': '30', 'weight': 75.2,'contact_details':{'phone':'2353462'}}

patient1 = Patient(**patient_info)

update_patient_data(patient1)

In [None]:
# Field validator
from pydantic import field_validator

class Patient(BaseModel):

    name: str
    email: EmailStr
    age: int
    weight: float
    married: bool
    allergies: List[str]
    contact_details: Dict[str, str]

    @field_validator('email')
    @classmethod
    def email_validator(cls, value):

        valid_domains = ['hdfc.com', 'icici.com']
        # abc@gmail.com
        domain_name = value.split('@')[-1]

        if domain_name not in valid_domains:
            raise ValueError('Not a valid domain')

        return value
    
    @field_validator('name')
    @classmethod
    def transform_name(cls, value):
        return value.upper()
    
    @field_validator('age', mode='after')
    @classmethod
    def validate_age(cls, value):
        if 0 < value < 100:
            return value
        else:
            raise ValueError('Age should be between 0 and 100')


def update_patient_data(patient: Patient):

    print(patient.name)
    print(patient.age)
    print(patient.allergies)
    print(patient.married)
    print('updated')

patient_info = {'name':'nitish', 'email':'abc@icici.com', 'age': '30', 'weight': 75.2, 'married': True, 'allergies': ['pollen', 'dust'], 'contact_details':{'phone':'2353462'}}

patient1 = Patient(**patient_info) # validation -> type coercion

update_patient_data(patient1)

In [None]:
# Model validator
from pydantic import model_validator

class Patient(BaseModel):

    name: str
    email: EmailStr
    age: int
    weight: float
    married: bool
    allergies: List[str]
    contact_details: Dict[str, str]

    @model_validator(mode='after')
    def validate_emergency_contact(self):
        # 'after' validators run on the fully validated instance and must return it
        if self.age > 60 and 'emergency' not in self.contact_details:
            raise ValueError('Patients older than 60 must have an emergency contact')
        return self



def update_patient_data(patient: Patient):

    print(patient.name)
    print(patient.age)
    print(patient.allergies)
    print(patient.married)
    print('updated')

patient_info = {'name':'nitish', 'email':'abc@icici.com', 'age': '65', 'weight': 75.2, 'married': True, 'allergies': ['pollen', 'dust'], 'contact_details':{'phone':'2353462', 'emergency':'235236'}}

patient1 = Patient(**patient_info) 

update_patient_data(patient1)

In [None]:
# Computed field
from pydantic import computed_field

class Patient(BaseModel):

    name: str
    email: EmailStr
    age: int
    weight: float # kg
    height: float # metres
    married: bool
    allergies: List[str]
    contact_details: Dict[str, str]

    @computed_field
    @property
    def bmi(self) -> float:
        bmi = round(self.weight/(self.height**2),2)
        return bmi



def update_patient_data(patient: Patient):

    print(patient.name)
    print(patient.age)
    print(patient.allergies)
    print(patient.married)
    print('BMI', patient.bmi)
    print('updated')

patient_info = {'name':'nitish', 'email':'abc@icici.com', 'age': '65', 'weight': 75.2, 'height': 1.72, 'married': True, 'allergies': ['pollen', 'dust'], 'contact_details':{'phone':'2353462', 'emergency':'235236'}}

patient1 = Patient(**patient_info) 

update_patient_data(patient1)

In [None]:
# Nested model

class Address(BaseModel):

    city: str
    state: str
    pin: str

class Patient(BaseModel):

    name: str
    gender: str
    # gender: str = 'Male'
    age: int
    address: Address

address_dict = {'city': 'gurgaon', 'state': 'haryana', 'pin': '122001'}

address1 = Address(**address_dict)

patient_dict = {'name': 'nitish', 'gender': 'male', 'age': 35, 'address': address1}

patient1 = Patient(**patient_dict)


# serialization

temp = patient1.model_dump(include={'name'})  # include/exclude take a set (or dict) of field names
# temp = patient1.model_dump(exclude_unset=True)
print(type(temp))  # <class 'dict'>


# Git & GitHub


## Viewing Commit History
- `git log`: Show the commit history.
- `git log --oneline`: Display a simplified commit history in one line per commit.
- `git log --stat`: Show commit history with file changes statistics.
- `git log -p`: Display the commit history with patch differences.

## Viewing a Specific Commit
- `git show commit_id`: Show details of a specific commit using its ID.

## Comparing Changes
- `git diff`: Show unstaged changes in the working directory (add `--staged` for staged changes, or pass two commit IDs to compare commits).

## Tagging Commits
- `git tag -a V1.0.0`: Create an annotated tag `V1.0.0` at the current commit.
- `git tag -a V1.0.0 commit_id`: Create an annotated tag `V1.0.0` at a specific commit.
- `git tag -d V1.0.0`: Delete the tag `V1.0.0`.

## Working with Branches
- `git branch`: List all local branches.
- `git branch branch_name`: Create a new branch `branch_name` at the current commit.
- `git branch branch_name commit_id`: Create a new branch `branch_name` at a specific commit.
- `git checkout branch_name`: Switch to the branch `branch_name`.

## Advanced Log Views
- `git log --oneline --all`: Show a simplified commit history for all branches.
- `git log --oneline --all --graph`: Display a graphical representation of the commit history for all branches.

## Committing Changes
- `git commit -am "commit message"`: Stage every modified or deleted **tracked** file and commit in one step (new, untracked files still need `git add`).

## Merging Branches
- `git branch -D branch_name`: Delete the branch `branch_name` forcefully.
- `git merge branch_name`: Merge `branch_name` into the current branch.
- `git merge --squash branch_name`: Combine all changes from `branch_name` into a single set of staged changes; run `git commit` afterwards to record them as one new commit.

## Listing Branches
- `git branch -r`: List all remote branches.
- `git branch -a`: List all local and remote branches.

## Resetting and Pushing Changes
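- `git reset --hard commit_id`: Move the current branch to `commit_id` and reset the index and working directory to match, discarding uncommitted changes.
- `git push origin master --force`: Overwrite the remote `master` branch with the local one; needed after rewriting already-pushed history, and it can destroy commits that others have pushed.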

## Undoing Changes
- Editing the last commit message:
  - `git commit --amend`: Edit the last commit message or add changes to the last commit.
- Forgot to add some files to the last commit:
  - `git add <file>` then `git commit --amend`: Add the files and amend the last commit.
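- Rolling back to a specific state:
  - `git show commit_id`: Inspect the commit you want to roll back to; this only displays it. To actually move the branch there, follow up with `git reset --hard commit_id` (this discards later local changes).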
- Revert a commit:
  - `git revert commit_id`: Create a new commit that undoes the changes of a specific commit.


# SQL

# MLOps

## Project Workflow with DVC


### Initial Setup
1. `git init`: Initialize a new Git repository.
2. `dvc init`: Initialize a new DVC repository.

### Adding Data and Models
3. `dvc add data`: Track the `data` directory with DVC.
4. `dvc add models`: Track the `models` directory with DVC.
   - DVC will add some files in the structure and make changes in the `.gitignore` file to avoid tracking large files directly with Git.

### Committing Changes to Git
5. `git add .`: Stage the DVC metadata, including `.gitignore`, `data.dvc`, and `models.dvc` (`dvc.lock` appears only once a pipeline has been run).
6. `git commit -m "initial commit"`: Commit the staged files with a message.

### Modifying and Committing Changes
- Make some changes in the data, then run `dvc add data` again so that `data.dvc` records the new version.
7. `git add .`: Stage the changed file `data.dvc`.
8. `git commit -m "second commit"`: Commit the changes with a message.

### Checking Out Previous Versions
9. `git checkout commit_id`: Check out the first commit or a specific commit.
10. `dvc checkout`: Restore the data to the state of the checked-out commit.

### Equivalent Commands
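- `dvc add data` and `dvc commit` overlap but are not identical: `dvc add` starts tracking a path (re-running it records a new version), while `dvc commit` records the current state of files DVC already tracks, including pipeline outputs.
- `dvc checkout` and `dvc pull` both update the workspace: `dvc checkout` restores files from the local cache only, while `dvc pull` first downloads missing files from remote storage and then checks them out.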
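
The relationship between the pointer file, the cache, and `dvc checkout` can be pictured with a toy model. This is a sketch under loud assumptions, not DVC's implementation: the `.ptr` file, the `cache` folder layout, and the helpers `toy_add` and `toy_checkout` are invented for illustration, and SHA-256 stands in for whatever hash DVC records.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def toy_add(data_file: Path, cache_dir: Path) -> Path:
    """Copy the file into a hash-addressed cache and write a small pointer file."""
    content = data_file.read_bytes()
    digest = hashlib.sha256(content).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / digest).write_bytes(content)
    # The pointer file plays the role of data.dvc: small enough to commit to Git
    pointer = data_file.with_name(data_file.name + ".ptr")
    pointer.write_text(json.dumps({"hash": digest, "path": data_file.name}))
    return pointer


def toy_checkout(pointer: Path, cache_dir: Path) -> None:
    """Restore the data file named in the pointer from the cache."""
    meta = json.loads(pointer.read_text())
    restored = (cache_dir / meta["hash"]).read_bytes()
    pointer.with_name(meta["path"]).write_bytes(restored)


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    cache = work / "cache"
    data = work / "data.csv"

    data.write_text("a,b\n1,2\n")
    pointer = toy_add(data, cache)
    first_pointer = pointer.read_text()   # what Git would store in the first commit

    data.write_text("a,b\n1,2\n3,4\n")
    toy_add(data, cache)                  # pointer changes only when add is re-run
    second_pointer = pointer.read_text()  # what Git would store in the second commit

    pointer.write_text(first_pointer)     # stands in for `git checkout <first commit>`
    toy_checkout(pointer, cache)          # stands in for `dvc checkout`
    restored_text = data.read_text()

pointers_differ = first_pointer != second_pointer
print(pointers_differ)  # → True
print(restored_text == "a,b\n1,2\n")  # → True
```

Git only ever sees the small pointer file, which is why checking out an old commit and then restoring from the cache brings the matching data back.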


## Workflow for Using DVC for Data Versioning and ML Pipeline

### Initial Setup
1. `git init`: Initialize a new Git repository.
2. `dvc init`: Initialize a new DVC repository.

### Adding and Tracking Data
3. `dvc add data`: Track the `data` directory with DVC.
   - DVC will add files such as `data.dvc` and update `.gitignore` to avoid tracking the actual data files with Git.

### Adding and Tracking Models
4. `dvc add models`: Track the `models` directory with DVC.
   - Similar to the data, DVC will create `models.dvc` and make necessary changes in `.gitignore`.

### Committing Changes to Git
5. `git add .`: Stage all files, including `.gitignore`, `data.dvc`, and `models.dvc` (`dvc.lock` is created later, when the pipeline is first run).
6. `git commit -m "initial commit"`: Commit the staged files with a message.

### Creating and Managing ML Pipelines
7. Create a DVC pipeline:
   - Create a `dvc.yaml` file to define stages of the pipeline (e.g., data preprocessing, training, evaluation).
   - Example:
     ```yaml
     stages:
       preprocess:
         cmd: python src/preprocess.py
         deps:
           - src/preprocess.py
           - data/raw
         outs:
           - data/processed
       train:
         cmd: python src/train.py
         deps:
           - src/train.py
           - data/processed
         outs:
           - models/model.pkl
       evaluate:
         cmd: python src/evaluate.py
         deps:
           - src/evaluate.py
           - models/model.pkl
         outs:
           - metrics/metrics.json
     ```

### Running the Pipeline
8. `dvc repro`: Reproduce the pipeline by running the stage commands in `dvc.yaml` in dependency order, skipping stages whose dependencies have not changed and recording the resulting hashes in `dvc.lock`.
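
How the stage order follows from `deps` and `outs` can be sketched in plain Python. This is a simplified model rather than DVC's algorithm: the stages are copied from the `dvc.yaml` example into a dict, the helpers `execution_order` and `stages_to_rerun` are hypothetical, and the sketch assumes every rerun stage changes its outputs (DVC compares hashes instead).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The stages from the dvc.yaml example, written as a plain dict
stages = {
    "preprocess": {"deps": ["src/preprocess.py", "data/raw"], "outs": ["data/processed"]},
    "train": {"deps": ["src/train.py", "data/processed"], "outs": ["models/model.pkl"]},
    "evaluate": {"deps": ["src/evaluate.py", "models/model.pkl"], "outs": ["metrics/metrics.json"]},
}


def execution_order(stages):
    """Order stages so that each one runs after the stages producing its deps."""
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


def stages_to_rerun(stages, changed_paths):
    """Stages with a changed dependency, plus everything downstream of them."""
    dirty = set(changed_paths)
    rerun = []
    for name in execution_order(stages):
        if dirty & set(stages[name]["deps"]):
            rerun.append(name)
            dirty |= set(stages[name]["outs"])  # assumption: outputs always change
    return rerun


order = execution_order(stages)
rerun_after_train_edit = stages_to_rerun(stages, ["src/train.py"])
print(order)                   # → ['preprocess', 'train', 'evaluate']
print(rerun_after_train_edit)  # → ['train', 'evaluate']
```

Editing only `src/train.py` leaves `preprocess` untouched, which is the saving a pipeline gives over rerunning every script by hand.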

### Versioning Data and Models
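- After making changes to data or models that are tracked with `dvc add`:
  1. `dvc add data`: Update the tracking of the `data` directory.
  2. `dvc add models`: Update the tracking of the `models` directory.
  3. `git add .`: Stage the updated DVC files.
  4. `git commit -m "update data and models"`: Commit the changes.
- Outputs declared under `outs` in `dvc.yaml` are recorded in `dvc.lock` by `dvc repro` instead. DVC rejects a `dvc add` path that overlaps a stage output, so in a pipeline project track only the inputs (for example `data/raw`) with `dvc add`.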

### Checking Out Previous Versions
- To revert to a previous state:
  1. `git checkout commit_id`: Check out a specific commit.
  2. `dvc checkout`: Restore the data and models to the state of the checked-out commit.

### Pulling Data from Remote Storage
- If using remote storage for data:
  1. `dvc remote add -d myremote s3://mybucket/path`: Add and set a default remote storage.
  2. `dvc push`: Push data and models to the remote storage.
  3. `dvc pull`: Pull data and models from the remote storage to your local workspace.

### Equivalent Commands
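- `dvc add data` vs `dvc commit`: Both record the current state of data, but `dvc commit` only applies to files DVC already tracks (including pipeline outputs).
- `dvc checkout` vs `dvc pull`: `dvc checkout` restores from the local cache; `dvc pull` downloads from the remote and then checks out.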


## Virtual Environment

- `python -m venv myenv`: Create a virtual environment named `myenv`.
- `source myenv/bin/activate`: Activate the virtual environment (Linux/macOS).
- `myenv\Scripts\activate.bat`: Activate the virtual environment (Windows).
- `deactivate`: Deactivate the virtual environment.
- `pip freeze > requirements.txt`: Write the list of installed packages and their versions to a file.
- `pip install -r requirements.txt`: Install the packages listed in the `requirements.txt` file.
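
The same environment can also be created from Python with the standard-library `venv` module, which is handy in setup scripts. A minimal sketch: the temporary directory and the helper name `in_virtualenv` are choices made for this example, and `with_pip=False` is used only to keep it fast.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "myenv"
    # with_pip=False skips installing pip; `python -m venv myenv` installs it by default
    venv.create(env_dir, with_pip=False)
    has_config = (env_dir / "pyvenv.cfg").exists()

print("environment created:", has_config)
print("running inside a venv:", in_virtualenv())
```

Comparing `sys.prefix` with `sys.base_prefix` is a quick way to confirm that activation worked before running `pip install`.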

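- `cookiecutter -c v1 https://github.com/drivendata/cookiecutter-data-science`: Generate a project skeleton from the Cookiecutter Data Science template, checking out its `v1` version.
- Docker:
  - `docker build -t demo .`: Build an image tagged `demo` from the Dockerfile in the current directory.
  - `docker run -p 8080:8000 demo`: Run the image, mapping port 8080 on the host to port 8000 in the container.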

## Capstone_MLOps

1) **Initial Commit:**
   - Project setup using Cookiecutter for standardized project structure.

2) **EDA Complete:**
   - Initialized DVC (`dvc init`) to manage data versions.
   - Data placed in the `data` folder; exploratory data analysis (EDA) completed in Jupyter Notebook.

3) **Modular Programming Complete:**
   - Refactored code into modular components for better maintainability and reusability.

4) **FastAPI Frontend Complete:**
   - Developed a FastAPI frontend for interacting with the backend services.

5) **Dockerization Complete:**
   - Containerized the application using Docker, ensuring it runs consistently in different environments.

6) **DVC Pipeline:**
   - Created a DVC pipeline to automate data processing and model training workflows.

7) **DVC Experiment Tracking Complete:**
   - Set up DVC experiment tracking on a new branch to monitor model performance.

8) **MLflow Experiment Tracking Complete:**
   - Switched back to the master branch and integrated MLflow for experiment tracking.

9) **CI Complete:**
   - Implemented continuous integration (CI) with GitHub Actions to run automated checks on every change.
   - Configured the CI workflow to read data from AWS S3.

10) **CI + ECR:**
    - Pushed the Docker images to AWS Elastic Container Registry (ECR) so they can be pulled for deployment.

11) **CI + ECR + CD (EC2):**
    - Set up continuous deployment (CD) to automatically deploy updates to an EC2 instance.

12) **Kubernetes + Seldon:**
    - Deployed the machine learning model on Kubernetes using Seldon for scalable and reliable serving.

# Flow (commands)

## Docker

## MLflow

# Sklearn

# Pytorch

# Langchain

- **Concepts**: Q&A, RAG, Tools, Agents
- **Core components**: Models, Prompts, Chains, Memory, Indexes, Agents
  - **Indexes**: Document loader, Text splitter, Vector store, Retriever
- **LangChain features**: Structured Output, Runnables

----------------

**Structured Output**: An LLM may or may not support structured output natively.
- **Supported**: use `with_structured_output` with one of three schema types (`TypedDict`, Pydantic, JSON Schema)
  - `TypedDict`: defines the expected keys and types; no runtime validation, so it acts as a hint; use `Annotated` to attach descriptions
  - Pydantic: subclass `BaseModel`; validates the data; use `Field` for descriptions and other constraints
  - JSON Schema: language-independent way to describe and validate the data structure
- **Not supported**: use an output parser, which works with both kinds of model (`StrOutputParser`, `JsonOutputParser`, `StructuredOutputParser`, `PydanticOutputParser`)
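
The difference between "hint only" and "validated" can be seen with the standard library alone. In this sketch the `Review` schema and the helpers `field_descriptions` and `validate` are invented for illustration; `validate` is a hand-written stand-in for the kind of check Pydantic performs automatically, not Pydantic's API.

```python
from typing import Annotated, TypedDict, get_type_hints


class Review(TypedDict):
    summary: Annotated[str, "A one-sentence summary of the review"]
    rating: Annotated[int, "Score from 1 to 5"]


def field_descriptions(schema) -> dict:
    """Collect the description attached to each field through Annotated."""
    hints = get_type_hints(schema, include_extras=True)
    return {name: hint.__metadata__[0] for name, hint in hints.items()}


def validate(schema, data: dict) -> list:
    """Hand-written type check; returns a list of error messages."""
    hints = get_type_hints(schema)  # Annotated wrappers are stripped here
    errors = []
    for name, expected in hints.items():
        if name not in data:
            errors.append(f"{name}: missing")
        elif not isinstance(data[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors


# A TypedDict is an ordinary dict at runtime, so this wrong value is accepted silently
bad: Review = {"summary": "Great phone", "rating": "five"}

descriptions = field_descriptions(Review)
errors = validate(Review, bad)
print(descriptions["rating"])  # → Score from 1 to 5
print(errors)                  # → ['rating: expected int']
```

The descriptions are what a model would receive as guidance, while catching the bad `rating` requires an explicit validation step.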

**Chains**: Sequential, Parallel, Conditional

**Runnables**:
- In the early days, each chain was developed separately for a specific task or use case.
- Runnables standardize the components so that chains are easy to build and compose.
- The standardized components come in two kinds: task-specific runnables and runnable primitives.
  - **Task-specific**: the core components, converted to runnables
  - **Primitives**: execution logic that connects other runnables (`RunnableSequence`/LCEL, `RunnableParallel`, `RunnablePassthrough`, `RunnableLambda`, `RunnableBranch`)
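
The idea behind runnables, one common `invoke` interface plus a few connectors, fits in a short toy model. None of this is LangChain's code: the `Runnable` class and the `parallel`, `passthrough`, and `branch` helpers below are simplified stand-ins, and `fake_model` is a plain function pretending to be an LLM.

```python
from typing import Any, Callable


class Runnable:
    """Toy runnable: a single invoke method, with | for chaining."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Runnable") -> "Runnable":
        # Sequence: feed this runnable's output into the next one
        return Runnable(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches: Runnable) -> Runnable:
    """Run every branch on the same input and collect the results in a dict."""
    return Runnable(lambda value: {key: r.invoke(value) for key, r in branches.items()})


def passthrough() -> Runnable:
    """Return the input unchanged, useful inside parallel."""
    return Runnable(lambda value: value)


def branch(condition, if_true: Runnable, if_false: Runnable) -> Runnable:
    """Pick one of two runnables based on the input."""
    return Runnable(lambda value: (if_true if condition(value) else if_false).invoke(value))


# A fake "prompt | model | parser" chain built from plain functions
prompt = Runnable(lambda topic: f"Write one line about {topic}")
fake_model = Runnable(lambda text: {"content": text.upper()})
parser = Runnable(lambda message: message["content"])

chain = prompt | fake_model | parser
result = chain.invoke("dvc")

fan_out = parallel(original=passthrough(), length=Runnable(len))
fan_result = fan_out.invoke("hello")

router = branch(lambda text: len(text) > 5, Runnable(str.upper), Runnable(str.lower))
routed = router.invoke("LangChain")

print(result)      # → WRITE ONE LINE ABOUT DVC
print(fan_result)  # → {'original': 'hello', 'length': 5}
print(routed)      # → LANGCHAIN
```

Because every piece exposes the same `invoke` method, a chain is itself a runnable and can be nested inside another chain.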


**RAG**: Document loader, Text splitter, Vector store, Retriever
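
The last two components, vector store and retriever, can be illustrated without any external library. This is a toy sketch: `embed` uses word counts where a real system would call an embedding model, and `ToyVectorStore` is a hypothetical class, not a LangChain one.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts instead of a learned vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self):
        self.items = []  # (chunk, vector) pairs

    def add(self, chunks):
        self.items.extend((chunk, embed(chunk)) for chunk in chunks)

    def retrieve(self, query: str, k: int = 1):
        """Return the k stored chunks most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore()
store.add([
    "dvc tracks data and model files",
    "fastapi serves the model over http",
    "docker packages the application",
])
top = store.retrieve("which tool tracks data files", k=1)
print(top)  # → ['dvc tracks data and model files']
```

In a full RAG chain the retrieved chunks are inserted into the prompt, so the model answers from them instead of from memory alone.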

**Document loaders**: `TextLoader`, `PyPDFLoader` (and many other file types), web-based loaders, `CSVLoader`, `DirectoryLoader`; lazy loading (`lazy_load`) for large files

**Text splitters**: length-based, text-structure-based, document-structure-based (e.g., source code and markup languages), semantic-meaning-based
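
The simplest of these, length-based splitting, can be written in a few lines. A minimal sketch: the function name `split_by_length` is made up for this example, and the parameters `chunk_size` and `chunk_overlap` simply mirror the usual splitter vocabulary.

```python
def split_by_length(text: str, chunk_size: int, chunk_overlap: int = 0) -> list:
    """Cut text into chunks of at most chunk_size characters, sharing chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = split_by_length("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # → ['abcd', 'defg', 'ghij']
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk. Structure-based and semantic splitters exist because a fixed character count can still cut through the middle of a thought.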

**Tools**: Tools are the building blocks that agents use to act.
- Tools (built-in and custom) are plain Python functions with a LangChain wrapper.
- Custom tools can be made in three ways: the `@tool` decorator, `StructuredTool` (with a Pydantic schema), or subclassing `BaseTool` (the abstract base class for tools).
- Lifecycle: tool binding (not every LLM supports it), tool calling, tool execution.
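
The three steps of binding, calling, and execution can be walked through with a toy decorator. Everything here is a simplified stand-in: the `Tool` class and `tool` decorator imitate the idea of a wrapper but are not LangChain's implementation, and the `tool_call` dict is hard-coded where a real model would produce it.

```python
import inspect
from typing import Any, Callable, Dict


class Tool:
    """Toy tool: a function plus the metadata a model needs in order to call it."""

    def __init__(self, func: Callable[..., Any]):
        self.func = func
        self.name = func.__name__
        self.description = inspect.getdoc(func) or ""
        self.args = {
            name: param.annotation.__name__
            for name, param in inspect.signature(func).parameters.items()
        }

    def invoke(self, arguments: Dict[str, Any]) -> Any:
        return self.func(**arguments)


def tool(func: Callable[..., Any]) -> Tool:
    """Decorator that wraps a plain function as a Tool."""
    return Tool(func)


@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


# Tool binding: the model is told each tool's name, description, and arguments
registry = {multiply.name: multiply}

# Tool calling: the model replies with a request (hard-coded for this sketch)
tool_call = {"name": "multiply", "args": {"a": 6, "b": 7}}

# Tool execution: the application, not the model, runs the function
result = registry[tool_call["name"]].invoke(tool_call["args"])

print(multiply.args)  # → {'a': 'int', 'b': 'int'}
print(result)         # → 42
```

The model never runs code itself; it only names a tool and its arguments, which is why the docstring and type hints matter so much.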

**Agents**:
- LangChain `AgentExecutor` (older approach)
- LangGraph (latest approach)