<a href="https://colab.research.google.com/github/SUDHARSSHINI/AUTO_BOM/blob/main/Auto_BOM_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Data Loading**


In [None]:
import pandas as pd

# Load the dataset
data = pd.read_csv('components_data.csv')
print(data.head())


**Data Preprocessing**

In [None]:
# Check for missing values
print(data.isnull().sum())

# Fill missing values with forward fill (alternatives: column mean, or dropping rows)
data = data.ffill()
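
The comment above names three ways to handle missing values. A minimal sketch of all three follows, on a toy frame whose column names (`voltage`, `vendor`) are hypothetical, since the real columns of `components_data.csv` are not shown in this notebook.

```python
import pandas as pd

# Hypothetical toy data; not taken from components_data.csv.
toy = pd.DataFrame({
    "voltage": [12.0, None, 24.0, 24.0],
    "vendor": ["A", "B", None, "B"],
})

# Option 1: forward fill copies the previous row's value downward.
filled_ffill = toy.ffill()

# Option 2: mean for numeric columns, most frequent value for text columns.
filled_stats = toy.copy()
filled_stats["voltage"] = toy["voltage"].fillna(toy["voltage"].mean())
filled_stats["vendor"] = toy["vendor"].fillna(toy["vendor"].mode()[0])

# Option 3: drop every row that has at least one missing value.
dropped = toy.dropna()

print(filled_ffill)
print(filled_stats)
print(dropped)
```

Forward fill only makes sense when row order carries meaning (for example, time series); for an unordered component table, the mean/mode fill or dropping rows is usually the safer default.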


In [None]:
# Separate the features (X) from the target variable (y)
X = data.drop('cylinders', axis=1)
y = data['cylinders']


In [None]:
from sklearn.preprocessing import StandardScaler

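# Standardize features to zero mean and unit variance.
# StandardScaler requires every column in X to be numeric.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)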
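
If the feature table contains text columns, `StandardScaler` alone will fail on them. One possible arrangement, sketched here under assumptions, is to one-hot encode the categorical columns and scale the numeric ones with a `ColumnTransformer`, fitting it on the training split only so that test-set statistics do not leak into preprocessing. The column names and values are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature table; the notebook's real columns are not shown.
X_demo = pd.DataFrame({
    "rated_voltage": [12, 24, 24, 5, 12, 24, 5, 12],
    "sensor_type": ["IR", "IR", "inductive", "IR",
                    "inductive", "IR", "IR", "inductive"],
})
y_demo = pd.Series([0, 1, 1, 0, 0, 1, 0, 0])

# Split first, so the scaler and encoder never see the test rows.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42
)

preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), ["rated_voltage"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["sensor_type"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

X_tr_ready = preprocess.fit_transform(X_tr)  # learn statistics on train only
X_te_ready = preprocess.transform(X_te)      # reuse them on the test rows

print(X_tr_ready.shape, X_te_ready.shape)
```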


**Training**

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)



**ML Algorithm**

In [None]:

# Train a random forest of 100 trees; random_state makes the run reproducible
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)


**Logic**

Objective: understand the reasoning behind each step of the machine learning workflow.

Steps:

1. Load and explore the dataset: understand its structure, data types, and features.
2. Clean and transform the data: handle missing values so that gaps do not bias the model, and encode categorical variables as numbers so the model can use them.
3. Separate features and target: split the features that influence the selection from the target variable, which is the component property you want to predict.
4. Scale the features: scaling improves convergence and performance for algorithms that are sensitive to the scale of their inputs. Tree-based models such as the random forest used here are largely insensitive to it.
5. Train the model: split the data so the model can be tested on unseen rows, which gives an estimate of real-world performance, then fit the model on the training set so it learns the patterns and relationships in the data.
6. Evaluate the model: after training, measure performance on the test set using accuracy and other metrics to check that the model generalizes.
7. Make predictions: use the trained model to predict component selection for new data.
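
The evaluation and prediction steps described above can be sketched as follows. Because the contents of `components_data.csv` are not shown, this uses a synthetic dataset from `make_classification` as a stand-in; the variable names are illustrative and the scores it prints say nothing about the real data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the component dataset (assumption, not real data).
X_syn, y_syn = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42, stratify=y_syn
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_tr, y_tr)

# Evaluate on rows the model has never seen.
y_pred = clf.predict(X_te)
test_accuracy = accuracy_score(y_te, y_pred)
print(f"Test accuracy: {test_accuracy:.3f}")
print(classification_report(y_te, y_pred))

# Predict the class of a single new row (here, borrowed from the test set).
new_row = X_te[:1]
predicted_class = clf.predict(new_row)[0]
print("Predicted class:", predicted_class)
```

In the notebook itself the same calls would take `model`, `X_test`, and `y_test` from the cells above.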

**Extract Data from a PDF File**

In [None]:
!pip install PyPDF2
import os
import PyPDF2
import pandas as pd
from google.colab import drive

drive.mount('/content/drive')

# Function to read PDFs
def read_pdf(file_path):
    with open(file_path, 'rb') as f:
        reader = PyPDF2.PdfReader(f)
        text = ""
        for page in reader.pages:
            # Guard against pages with no extractable text (e.g. scanned images)
            text += page.extract_text() or ""
        return text

# Function to extract relevant information from PDF (IR Sensor example)
def extract_ir_sensor_info(text):
    # Keywords or sections to look for in the PDF
    keywords = ["IR Sensor", "Proximity Sensor", "Detection Range", "Supply Voltage", "Temperature Range"]

    # Split text into lines for easier processing
    lines = text.split('\n')

    # Store relevant information
    sensor_info = {}

    for line in lines:
        # Check for keywords and extract relevant lines
        for keyword in keywords:
            if keyword in line:
                sensor_info[keyword] = line

    return sensor_info

# Example usage: point pdf_path at the datasheet to parse
# pdf_path = '/content/drive/MyDrive/Project/ADNA Automation/Electrical component data/omran plc.pdf'
pdf_path = '/content/drive/MyDrive/Project/ADNA Automation/Electrical component data/SENSOR /IR SENSOR.pdf'
pdf_text = read_pdf(pdf_path)
sensor_info = extract_ir_sensor_info(pdf_text)

# Display extracted sensor info
for key, value in sensor_info.items():
    print(f"{key}: {value}")


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
IR Sensor:  Specifications of IR Sensors for Industrial Automation  
Detection Range: 2. Detection Range : 
Temperature Range: 5. Operating Temperature Range : 
Supply Voltage: 6. Supply Voltage : 
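
The output above shows labels such as `Detection Range :` with nothing after the colon, which suggests the values sit on a following line of the extracted text. A minimal sketch of one way to capture them is given below. The helper name `extract_labelled_values` and the sample text are hypothetical, and the assumed layout (value after the colon, or on the next non-empty line) may not match the actual PDF.

```python
def extract_labelled_values(text, keywords):
    """Map each keyword to the text after its colon, or to the next non-empty line."""
    lines = [line.strip() for line in text.split("\n")]
    found = {}
    for i, line in enumerate(lines):
        for keyword in keywords:
            if keyword not in line or keyword in found:
                continue
            # Text after the first colon on the same line (empty if no colon).
            _, _, after_colon = line.partition(":")
            value = after_colon.strip()
            if not value:
                # Fall back to the next non-empty line, if there is one.
                value = next((nxt for nxt in lines[i + 1:] if nxt), "")
            found[keyword] = value
    return found


# Hypothetical text imitating the assumed layout; not taken from the real PDF.
sample_text = "2. Detection Range : \n80 cm\n6. Supply Voltage : 12V to 24V\n"
info = extract_labelled_values(sample_text, ["Detection Range", "Supply Voltage"])
print(info)
```

Printing `repr(pdf_text[:500])` first is a quick way to check how the real file lays out labels and values before relying on this rule.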


In [None]:
# Function to select components based on extracted information
def select_components(sensor_info):
    # Define the criteria for component selection
    bom_criteria = {
        "IR Sensor": "Proximity Sensor",  # keys must match the keywords used in extract_ir_sensor_info
        "Detection Range": "80 cm",
        "Supply Voltage": "12V to 24V",
        "Temperature Range": "-20°C to 70°C"
    }

    # Check if the extracted sensor info matches the criteria
    selected_components = {}

    for key, value in bom_criteria.items():
        if key in sensor_info and value in sensor_info[key]:
            selected_components[key] = sensor_info[key]

    return selected_components

# Example usage
selected_bom = select_components(sensor_info)

# Display the selected components for the BOM
print("Selected Components for BOM:")
for key, value in selected_bom.items():
    print(f"{key}: {value}")


Selected Components for BOM:
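
The selection above is empty because the extracted lines hold only labels, so no criterion string appears in them. Even with values present, exact substring matching is brittle: a datasheet rated "10V to 30V" would satisfy a "12V to 24V" requirement but would not match as text. The sketch below compares numeric ranges instead. The helper names `parse_range` and `covers` are hypothetical, and the parser assumes ranges are written as "low to high".

```python
import re


def parse_range(text):
    """Return (low, high) from strings such as '12V to 24V' or '-20°C to 70°C'.

    Assumes 'low to high' wording; a form like '10-30V' would be misread
    because the hyphen is taken as a minus sign.
    """
    numbers = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", text)]
    if not numbers:
        return None
    return min(numbers), max(numbers)


def covers(datasheet_value, required_value):
    """True if the datasheet range fully contains the required range."""
    have = parse_range(datasheet_value)
    need = parse_range(required_value)
    if have is None or need is None:
        return False
    return have[0] <= need[0] and need[1] <= have[1]


print(covers("10V to 30V", "12V to 24V"))
print(covers("5V", "12V to 24V"))
print(covers("-40°C to 85°C", "-20°C to 70°C"))
```

Units are ignored here, so the caller has to make sure both strings use the same unit.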


**Storing the BOM in a CSV File**

In [None]:
# Function to write the selected BOM entries to a CSV file.
# An empty selection produces a file that contains only the header row.
def save_bom_to_csv(selected_bom, file_name='BOM.csv'):
    df = pd.DataFrame(list(selected_bom.items()), columns=['Component', 'Details'])
    df.to_csv(file_name, index=False)
    print(f"BOM saved to {file_name}")

# Example usage
save_bom_to_csv(selected_bom)


BOM saved to BOM.csv
