### Initial Project Setup

- Run `template.py` to create the file and folder structure.
- Write the code in `setup.py` and `pyproject.toml`.
- Create a virtual environment and install the dependencies from `requirements.txt`.
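
A minimal sketch of what a `template.py` script might look like. The file list and the `src` package name are illustrative assumptions, not the project's actual layout; only the names mentioned in these notes are taken from the source.

```python
from pathlib import Path

# Hypothetical package name and file list; adjust to the real project layout.
PROJECT_NAME = "src"
FILES = [
    f"{PROJECT_NAME}/__init__.py",
    f"{PROJECT_NAME}/components/__init__.py",
    f"{PROJECT_NAME}/components/data_ingestion.py",
    f"{PROJECT_NAME}/constants/__init__.py",
    f"{PROJECT_NAME}/utils/main_utils.py",
    "config/schema.yaml",
    "setup.py",
    "pyproject.toml",
    "requirements.txt",
    "demo.py",
]


def create_structure(root, files=FILES):
    """Create each missing file (and its parent folders) under root.

    Existing files are left untouched. Returns the list of paths created.
    """
    created = []
    for relative in files:
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Calling `create_structure(".")` from the project root creates the skeleton; running it a second time creates nothing, so it is safe to re-run after adding new entries to the list.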


### MongoDB setup

- Sign in to MongoDB Atlas.
- Create a project and a test cluster.
- Create a database user and note its username and password (referred to below as `<username>` and `<password>`).
- Navigate to Network Access and add the IP address `0.0.0.0/0` to allow access from anywhere.
- Go back to the project and click "connection string"; select the Python driver, version 3.6 or later.
- Save the connection string: `mongodb+srv://<username>:<password>@cluster0.tqujt.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0`
- Create an `experiments` folder, save the `data.csv` file in it, and create a new notebook, `mongodb_demo.ipynb`, to send the data to MongoDB.
- After pushing the data to MongoDB, check that it was uploaded successfully.

![image.png](attachment:image.png)
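
The upload step in `mongodb_demo.ipynb` could be sketched roughly as follows. This is an illustration under assumptions: the function names are hypothetical, the CSV is read with the standard library rather than pandas, and the database and collection names are placeholders. `MongoClient` and `insert_many` are the real `pymongo` calls.

```python
import csv


def csv_to_records(csv_path):
    """Read a CSV file into a list of dicts, one per row (values stay strings)."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def upload_records(collection, records):
    """Insert the records into a collection and return how many were inserted."""
    if not records:
        return 0
    result = collection.insert_many(records)
    return len(result.inserted_ids)


def get_collection(connection_url, db_name, collection_name):
    """Open a pymongo collection.

    pymongo is imported lazily so the two helpers above work without it.
    """
    from pymongo import MongoClient

    client = MongoClient(connection_url)
    return client[db_name][collection_name]
```

With a real connection string, `upload_records(get_collection(url, "my_db", "my_collection"), csv_to_records("experiments/data.csv"))` would push the rows, and `collection.count_documents({})` is one way to confirm the upload.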


### What are setup.py and pyproject.toml?

- `What is a pyproject.toml file?`
- TOML (Tom’s Obvious, Minimal Language) is a simple configuration file format, comparable to JSON or YAML, that is designed to be easy to read and write.
- `pyproject.toml` is the standard, TOML-formatted file for Python packaging metadata and build configuration.

- `Why pyproject.toml is important:`
- It was introduced with PEP 518 to modernize Python package building. Previously everything was done in setup.py; pyproject.toml instead declares the build requirements and project configuration as static data, which build tools can read without running any project code.
- It centralizes the project's metadata: name, version, dependencies, authors, and so on.
- It supports various build backends (setuptools, poetry, and others).

- `Explaining sections of pyproject.toml:`
- `[project]`: Defines the basic project information (name, version, description, authors).
- `[tool.setuptools]`: Holds setuptools-specific options, such as package discovery. The choice of setuptools as the build backend is itself declared in the `[build-system]` table.
- `[tool.setuptools.dynamic]`: Tells setuptools to read fields that `[project]` lists as `dynamic` from external files, for example dependencies from `requirements.txt`.

- setup.py after the advent of pyproject.toml: some tasks previously handled by setup.py (such as metadata) are now managed by pyproject.toml. setup.py can still be used, especially when the build has complex steps.

- `How do setup.py, pyproject.toml, and requirements.txt work together?`
- pyproject.toml: It is now the central place for project metadata. Instead of defining dependencies and project information in setup.py, you define them in pyproject.toml.
- In this project, the setting `dependencies = {file = "requirements.txt"}` under `[tool.setuptools.dynamic]` links requirements.txt to the TOML file, so when the project is built, its dependencies are read from requirements.txt.

- setup.py: It is still used for custom builds and configurations, but most of the basic functionality (metadata and dependencies) has moved to pyproject.toml. You might keep a minimal setup.py if you have custom build steps; for many projects it is no longer necessary.

- requirements.txt: It lists all project dependencies and their versions.
- Running `pip install -r requirements.txt` installs every listed dependency. pyproject.toml can reference the file (as this project does) so that the package's dependencies are pulled from it automatically.
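
For projects that keep a minimal `setup.py`, a common pattern is a small helper that reads `requirements.txt` and passes the result to `setup(install_requires=...)`. The sketch below shows only that helper; the name `get_requirements` and the rule for skipped lines are assumptions, not code from this project.

```python
from pathlib import Path


def get_requirements(file_path="requirements.txt"):
    """Return the requirement specifiers listed in file_path.

    Blank lines, comment lines, and pip option lines (for example "-e .")
    are skipped, because they are not valid install_requires entries.
    """
    requirements = []
    for raw_line in Path(file_path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements
```

When `[tool.setuptools.dynamic]` already points at `requirements.txt`, this helper is redundant; it is only relevant if the dependencies are declared from `setup.py` instead.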


### Setup Logger & Exception module

- Write the logging and exception modules and test them in `demo.py`.
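
As a rough sketch of what these two modules could contain: a logger factory and a custom exception that records where the original error occurred. The names `get_logger` and `MyException`, and the message format, are illustrative assumptions.

```python
import logging
import sys


def get_logger(name="demo", level=logging.INFO):
    """Return a logger with a single stream handler and a timestamped format."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


class MyException(Exception):
    """Wrap an error with the file name and line number where it was raised."""

    def __init__(self, error):
        # exc_info() is only populated while an exception is being handled.
        _, _, tb = sys.exc_info()
        if tb is not None:
            while tb.tb_next is not None:  # walk to the innermost frame
                tb = tb.tb_next
            location = f"{tb.tb_frame.f_code.co_filename}, line {tb.tb_lineno}"
        else:
            location = "unknown location"
        self.error_message = f"Error in [{location}]: {error}"
        super().__init__(self.error_message)

    def __str__(self):
        return self.error_message
```

In `demo.py`, a quick check might wrap a failing statement such as `1 / 0` in `try`/`except Exception as e`, log the error with `get_logger().error(...)`, and re-raise it as `MyException(e)`.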

### Upload experiment.ipynb file

- Build the complete project in a Jupyter notebook first, which makes the later conversion to modular code easier.

## Data Ingestion module
- Before writing the data ingestion module, declare the variables in the `constants` package's `__init__.py` file.
- Create a file `mongo_db_connection.py` to establish the MongoDB connection.
- Create a file `proj1_data.py` inside the `data_access` folder to connect to MongoDB and fetch the data.
- Write the code in `config_entity` and `artifacts_entity`.
- Set the MongoDB environment variable (PowerShell syntax):

    + ![image-2.png](attachment:image-2.png)

    + To set: `$env:MONGODB_URL = "mongodb+srv://<username>:<password>@cluster0.tqujt.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0"`
    + To check: `echo $env:MONGODB_URL`

- Write the modular code for `data_ingestion.py`:
    1. declare constant variables
    2. write config entity
    3. write artifact entity
    4. write data_ingestion.py
    5. write code in prediction_pipeline.py
    6. write code in app.py/demo.py
- test in demo.py

![image-3.png](attachment:image-3.png)
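
The ingestion steps above can be sketched as follows. This is a simplified illustration, not the project's code: the class and function names, the split ratio, and the seed are assumptions, and the records are split in memory instead of being written to train and test files. Only the `MONGODB_URL` variable name comes from these notes.

```python
import os
import random
from dataclasses import dataclass

MONGODB_URL_KEY = "MONGODB_URL"  # environment variable set in the step above
TRAIN_TEST_SPLIT_RATIO = 0.25    # hypothetical constant


def get_mongodb_url():
    """Read the connection string from the environment instead of hard-coding it."""
    url = os.getenv(MONGODB_URL_KEY)
    if not url:
        raise EnvironmentError(f"Environment variable {MONGODB_URL_KEY} is not set")
    return url


@dataclass
class DataIngestionConfig:
    """Settings for the ingestion step (the config entity)."""
    test_ratio: float = TRAIN_TEST_SPLIT_RATIO
    seed: int = 42


@dataclass
class DataIngestionArtifact:
    """Output of the ingestion step (the artifact entity)."""
    train_records: list
    test_records: list


def run_data_ingestion(records, config=None):
    """Shuffle the fetched records reproducibly and split them into train and test."""
    config = config or DataIngestionConfig()
    shuffled = list(records)
    random.Random(config.seed).shuffle(shuffled)
    n_test = int(len(shuffled) * config.test_ratio)
    return DataIngestionArtifact(
        train_records=shuffled[n_test:],
        test_records=shuffled[:n_test],
    )
```

Keeping the settings in a config entity and the outputs in an artifact entity lets the next pipeline stage consume the artifact without knowing how it was produced.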


## Data Validation module
- Create `utils/main_utils.py` (common helper functions) and `config/schema.yaml` (a full description of the dataset, used by the data validation step).
- Write the modular code for `data_validation`:
    1. declare constant variables
    2. write config entity
    3. write artifact entity
    4. write data_validation.py
    5. write code in training_pipeline.py
    6. write code in app.py/demo.py
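
A minimal sketch of the validation idea, assuming the schema simply maps column names to expected types. The dict below stands in for the contents of `config/schema.yaml` (which would normally be loaded with a YAML parser); the column names, function name, and artifact fields are hypothetical.

```python
from dataclasses import dataclass, field

# Stand-in for config/schema.yaml; these columns are made up for illustration.
SCHEMA = {
    "columns": {"id": "int", "age": "int", "gender": "str"},
}

_CASTS = {"int": int, "float": float, "str": str}


@dataclass
class DataValidationArtifact:
    """Result of the validation step: overall status plus any problems found."""
    validation_status: bool
    messages: list = field(default_factory=list)


def validate_records(records, schema=SCHEMA):
    """Check that every record has the expected columns with castable values."""
    expected = schema["columns"]
    messages = []
    for index, row in enumerate(records):
        missing = sorted(set(expected) - set(row))
        if missing:
            messages.append(f"row {index}: missing columns {missing}")
            continue
        for column, type_name in expected.items():
            try:
                _CASTS[type_name](row[column])
            except (TypeError, ValueError):
                messages.append(
                    f"row {index}: column '{column}' is not a valid {type_name}"
                )
    return DataValidationArtifact(validation_status=not messages, messages=messages)
```

The training pipeline could then stop early, or log the messages, whenever `validation_status` is `False`.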

