# StudentPerformancePredictionML
### This project analyzes how a student's performance (test scores) is affected by other variables such as gender, ethnicity, parental level of education, lunch, and test preparation course.

*******************************
*******************************
<br>

## Main Folder 👇

-----------------------
<br>

### setup.py 👇
<i><b>(/StudentPerformancePredictionML/setup.py)</b></i>

`setup.py` is responsible for building this machine learning project as an installable package.


```python
# Import Libraries
from setuptools import find_packages,setup
from typing import List
```

```python
setup(
name='StudentPerformancePredictionML',   # Name of project
version='0.0.1',   # Version 
author='Ninad',    # Author name
author_email='ninad.karlekar@vsit.edu.in',  # Author email-id
packages=find_packages(),
install_requires=get_requirements('requirements.txt') # Install all packages in file
)
# `find_packages()` searches the current directory and its subdirectories for packages, i.e. folders that contain an `__init__.py` file.
# Note: in the actual file, `get_requirements` must be defined above this `setup()` call.
```

```python
HYPEN_E_DOT='-e .'
def get_requirements(file_path:str)->List[str]:
    '''
    Return the list of requirements read from the given file
    '''
    requirements=[]
    with open(file_path) as file_obj:
        # Read every line written in requirements.txt
        for line in file_obj:
            # Drop inline comments and surrounding whitespace (including "\n")
            req=line.split(' #')[0].strip()
            # Skip blank lines, comment lines and "-e .", which is not an installable package
            if req and not req.startswith('#') and req!=HYPEN_E_DOT:
                requirements.append(req)

    return requirements
```

-----------------------
<br>

### requirements.txt 👇
<i><b>(/StudentPerformancePredictionML/requirements.txt)</b></i>

The `requirements.txt` file lists all the packages used while implementing the project.

```text
pandas
numpy
seaborn
-e .    # "-e ." installs this project in editable mode, which triggers setup.py
```

*******************************
*******************************
<br>

## src Folder 👇
<i><b>(/StudentPerformancePredictionML/src)</b></i>

This folder contains all the source code.

---------------------------

<br>

#### exception.py 👇
<i><b>(/StudentPerformancePredictionML/src/exception.py)</b></i> <br><br>

- We write our own **custom exception** here.



```python
import sys
from src.logger import logging

# When an exception is raised, build a custom message with the file name, line number and original error
def error_message_detail(error,error_detail:sys):
    _,_,exc_tb=error_detail.exc_info()
    file_name=exc_tb.tb_frame.f_code.co_filename
    error_message="Error occurred in python script name [{0}] line number [{1}] error message [{2}]".format(file_name,exc_tb.tb_lineno,str(error))

    return error_message

class CustomException(Exception):
    def __init__(self,error_message,error_detail:sys):
        super().__init__(error_message)
        self.error_message=error_message_detail(error_message,error_detail=error_detail)
    
    def __str__(self):
        return self.error_message
    
```
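As a rough usage sketch, the exception can be raised from inside an `except` block so that the original error is wrapped with its file name and line number. The block repeats a condensed copy of the class so that it runs on its own; `safe_divide` is a hypothetical helper used only for illustration and is not part of the project.

```python
import sys


def error_message_detail(error, error_detail):
    # exc_info() returns (type, value, traceback) for the exception being handled
    _, _, exc_tb = error_detail.exc_info()
    file_name = exc_tb.tb_frame.f_code.co_filename
    return "Error occurred in python script name [{0}] line number [{1}] error message [{2}]".format(
        file_name, exc_tb.tb_lineno, str(error)
    )


class CustomException(Exception):
    def __init__(self, error_message, error_detail):
        super().__init__(error_message)
        self.error_message = error_message_detail(error_message, error_detail=error_detail)

    def __str__(self):
        return self.error_message


def safe_divide(a, b):
    # Hypothetical helper: wraps any failure in CustomException
    try:
        return a / b
    except Exception as e:
        raise CustomException(e, sys)


if __name__ == "__main__":
    try:
        safe_divide(1, 0)
    except CustomException as err:
        # Prints the script name, the line of the failing division and "division by zero"
        print(err)
```

Note that `CustomException` has to be created while the original exception is still being handled; outside an `except` block `sys.exc_info()` returns no traceback and the message cannot be built.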

---------------------------

<br>

#### logger.py 👇
<i><b>(/StudentPerformancePredictionML/src/logger.py)</b></i> <br><br>

- Everything that happens during execution should be **logged**, so that all information is available later.


```python

import logging
import os
from datetime import datetime

LOG_FILE=f"{datetime.now().strftime('%m_%d_%Y_%H_%M_%S')}.log"
# Folder that holds all log files; only the folder is created here, not a folder per log file
logs_path=os.path.join(os.getcwd(),"logs")
os.makedirs(logs_path,exist_ok=True)

LOG_FILE_PATH=os.path.join(logs_path,LOG_FILE)

logging.basicConfig(
    filename=LOG_FILE_PATH,
    format="[ %(asctime)s ] %(lineno)d %(name)s - %(levelname)s - %(message)s",
    level=logging.INFO,
)

```

**To test `logger.py`:**
- Add the following code at the end of `logger.py`:
    ```python
    if __name__ == "__main__":
        logging.info("Logging started TEST LOGGING")
    ```

- Run the following command in the terminal:
    ```bash
    python src/logger.py
    ```

- The log file appears in the `logs` folder.

---------------------------

<br>

#### utils.py 👇
<i><b>(/StudentPerformancePredictionML/src/utils.py)</b></i> <br><br>




<br>

---------------------------

### components Folder 👇
<i><b>(/StudentPerformancePredictionML/src/components)</b></i>

<br>

---------------------------

#### data_ingestion.py 👇
<i><b>(/StudentPerformancePredictionML/src/components/data_ingestion.py)</b></i> <br><br>
This file contains all the code related to reading data.

```python
import os
import sys
from src.exception import CustomException # Import exception from src->exception
from src.logger import logging # Import logging from src->logger
import pandas as pd

from sklearn.model_selection import train_test_split
from dataclasses import dataclass # Generates __init__ for classes that mainly hold data

from src.components.data_transformation import DataTransformation
from src.components.data_transformation import DataTransformationConfig

from src.components.model_trainer import ModelTrainerConfig
from src.components.model_trainer import ModelTrainer

@dataclass
class DataIngestionConfig:
    train_data_path: str=os.path.join('artifacts',"train.csv") # All train data will be saved in this path
    test_data_path: str=os.path.join('artifacts',"test.csv") # All test data will be saved in this path
    raw_data_path: str=os.path.join('artifacts',"data.csv") # All raw data will be saved in this path

class DataIngestion:
    def __init__(self):
        self.ingestion_config=DataIngestionConfig() # The 3 paths above are stored in this config object

    def initiate_data_ingestion(self):
        logging.info("Entered the data ingestion method or component") 
        try:
            df=pd.read_csv(os.path.join('notebook','data','stud.csv')) # Read the raw dataset from CSV
            logging.info('Read the dataset as dataframe')

            os.makedirs(os.path.dirname(self.ingestion_config.train_data_path),exist_ok=True) 

            df.to_csv(self.ingestion_config.raw_data_path,index=False,header=True)

            logging.info("Train test split initiated")
            train_set,test_set=train_test_split(df,test_size=0.2,random_state=42)

            # After split save train data to train_data_path
            train_set.to_csv(self.ingestion_config.train_data_path,index=False,header=True) 

            # After split save test data to test_data_path
            test_set.to_csv(self.ingestion_config.test_data_path,index=False,header=True)

            logging.info("Ingestion of the data is completed")

            return(
                # passing train_data_path & test_data_path to next step(Data transformation)
                self.ingestion_config.train_data_path,
                self.ingestion_config.test_data_path

            )
        except Exception as e:
            raise CustomException(e,sys)
        
if __name__=="__main__":
    obj=DataIngestion()
    train_data,test_data=obj.initiate_data_ingestion()

    data_transformation=DataTransformation()
    train_arr,test_arr,_=data_transformation.initiate_data_transformation(train_data,test_data)

    modeltrainer=ModelTrainer()
    print(modeltrainer.initiate_model_trainer(train_arr,test_arr))
```
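Because `DataIngestionConfig` is a dataclass, its three paths are defaults that can be overridden per instance. The sketch below repeats the config class so that it runs on its own; the `artifacts_exp` folder name is a made-up example and does not appear in the project.

```python
import os
from dataclasses import dataclass


@dataclass
class DataIngestionConfig:
    train_data_path: str = os.path.join("artifacts", "train.csv")
    test_data_path: str = os.path.join("artifacts", "test.csv")
    raw_data_path: str = os.path.join("artifacts", "data.csv")


# Default configuration, as created in DataIngestion.__init__
default_config = DataIngestionConfig()

# Hypothetical override: write the train/test splits to another folder
experiment_config = DataIngestionConfig(
    train_data_path=os.path.join("artifacts_exp", "train.csv"),
    test_data_path=os.path.join("artifacts_exp", "test.csv"),
)

print(default_config.train_data_path)
print(experiment_config.train_data_path)
print(experiment_config.raw_data_path)  # not overridden, so still under "artifacts"
```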

<br>

---------------------------

#### data_transformation.py 👇
<i><b>(/StudentPerformancePredictionML/src/components/data_transformation.py)</b></i> <br><br>

- This file contains all the code related to transforming data.
- Here we write code for tasks such as converting categorical features into numerical features and handling label encoding.
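The source does not show the contents of `data_transformation.py`, so the following is only a dependency-free illustration of the two ideas named above, one-hot encoding and label encoding. The function names `one_hot_encode` and `label_encode` and the sample rows are assumptions made for this sketch; in practice a library encoder would normally do this work.

```python
from typing import Dict, List


def one_hot_encode(rows: List[Dict[str, str]], column: str) -> List[Dict[str, int]]:
    """Turn one categorical column into 0/1 indicator columns."""
    # Sort the distinct categories so the output columns have a stable order
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}_{cat}": int(row[column] == cat) for cat in categories}
        for row in rows
    ]


def label_encode(values: List[str]) -> List[int]:
    """Map each distinct category to an integer code (alphabetical order)."""
    codes = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [codes[value] for value in values]


# Made-up sample rows, not taken from the project's dataset
students = [
    {"lunch": "standard"},
    {"lunch": "free/reduced"},
    {"lunch": "standard"},
]

print(one_hot_encode(students, "lunch"))
print(label_encode([s["lunch"] for s in students]))
```

One-hot encoding avoids implying an order between categories, while label encoding keeps a single column at the cost of introducing an arbitrary numeric order.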

<br>

---------------------------

#### model_trainer.py 👇
<i><b>(/StudentPerformancePredictionML/src/components/model_trainer.py)</b></i> <br><br>

- This file contains all the code related to training the model.

*******************************
*******************************
<br>

## pipeline Folder 👇
<i><b>(/StudentPerformancePredictionML/src/pipeline)</b></i>

---------------------------

<br>

#### train_pipeline.py 👇
<i><b>(/StudentPerformancePredictionML/src/pipeline/train_pipeline.py)</b></i> <br><br>

- This file contains all the code related to the training pipeline.
- The training pipeline calls all the files in the components folder.
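The contents of `train_pipeline.py` are not shown in the source, so the following is only a sketch of how the components could be chained. It reuses the method names from the `__main__` block of `data_ingestion.py`; the function name `run_training_pipeline` and the idea of passing the components in as arguments are assumptions made for illustration.

```python
from typing import Any


def run_training_pipeline(ingestion: Any, transformation: Any, trainer: Any) -> Any:
    """Run ingestion, transformation and training in order and return the trainer's result."""
    # Step 1: read the raw data and get the paths of the train and test CSV files
    train_path, test_path = ingestion.initiate_data_ingestion()

    # Step 2: transform both files; the third returned value is ignored here,
    # as it is in the __main__ block of data_ingestion.py
    train_arr, test_arr, _ = transformation.initiate_data_transformation(train_path, test_path)

    # Step 3: train the model on the transformed data
    return trainer.initiate_model_trainer(train_arr, test_arr)
```

With the project's own classes, the call would presumably look like `run_training_pipeline(DataIngestion(), DataTransformation(), ModelTrainer())`. Passing the components in as arguments also makes the pipeline easy to test with small stand-in objects.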


---------------------------

<br>

#### predict_pipeline.py 👇
<i><b>(/StudentPerformancePredictionML/src/pipeline/predict_pipeline.py)</b></i> <br><br>

- This will have all code related to prediction of new data


--------------------------
**END**
