#### **Phase 6 guide (Deployment Plan)**

**1. Plan and implement deployment**  

- The project deployed two models, **Model 1 (XGBoost, machine learning)** and **Model 2 (multi-layer perceptron, deep learning)**, on **Google Cloud IDX**. The backend is built with **FastAPI**, which handles API requests for model predictions.
- The frontend interface (car_prediction.html) is a simple web form in which users enter car features for prediction. This setup is suitable for testing and for collecting initial user feedback.
- This stack (HTML and JavaScript on the frontend, Python/FastAPI on the backend) is lightweight and well suited to rapid prototyping and demonstration.

**2. Plan monitoring**
- The system is monitored through **FastAPI** server logs. User feedback from trial sessions will also help assess model performance and accuracy. 
- In the future, tools like Prometheus and Grafana can be integrated to monitor key metrics such as API response time, request volume, and errors.
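
Until such tools are integrated, the three metrics named above (response time, request volume, errors) can be approximated in-process. The following is a minimal sketch using only the standard library; `RequestMetrics`, `track`, and `fake_predict` are hypothetical names for illustration and are not part of FastAPI or of the project code.

```python
import functools
import time
from dataclasses import dataclass, field


@dataclass
class RequestMetrics:
    """In-memory counters for request volume, errors, and latency (hypothetical helper)."""
    requests: int = 0
    errors: int = 0
    latencies: list = field(default_factory=list)

    def track(self, func):
        """Wrap a handler so that every call updates the counters."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            self.requests += 1
            try:
                return func(*args, **kwargs)
            except Exception:
                self.errors += 1
                raise
            finally:
                # Record the elapsed time for successful and failed calls alike.
                self.latencies.append(time.perf_counter() - start)
        return wrapper

    def summary(self):
        """Return the current totals and the mean latency in seconds."""
        avg = sum(self.latencies) / len(self.latencies) if self.latencies else 0.0
        return {"requests": self.requests, "errors": self.errors, "avg_latency_s": avg}


metrics = RequestMetrics()


@metrics.track
def fake_predict(payload):
    # Stand-in for a prediction handler; an empty payload simulates a failure.
    if not payload:
        raise ValueError("empty payload")
    return 12345.0
```

Calling `metrics.summary()` after a few requests gives a quick view of the counters. They live in memory only, so they reset whenever the server restarts.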

**3. Plan maintenance**  
The system will undergo regular checks and updates, including:
- Updating models with new data.
- Adjusting hyperparameters if needed.
- Ensuring backend compatibility with FastAPI updates.
- Regular backups of data and model files.
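
The backup step could be sketched as below: each run copies the model files into a new timestamped folder. The file names are the ones given in the model-loading answer of this guide. The directory layout and the helper name `backup_artifacts` are assumptions for illustration.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path

# File names as listed in this guide; the directories passed in are assumptions.
ARTIFACTS = (
    "xgb_price_prediction.model",
    "mlp_model.keras",
    "scaler_price_prediction.pkl",
)


def backup_artifacts(source_dir, backup_root, names=ARTIFACTS):
    """Copy the artifacts that exist into a new timestamped folder and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_root) / stamp
    target.mkdir(parents=True, exist_ok=True)
    for name in names:
        src = Path(source_dir) / name
        if src.is_file():
            # copy2 also preserves file timestamps where the platform allows it.
            shutil.copy2(src, target / name)
    return target
```

A scheduled job could call `backup_artifacts("backend", "backups")` before each model update, so the previous version can be restored if the new one performs worse.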

**4. Produce a final report**  
The final report includes:
- The model development process.
- Evaluation results of Model 1 and Model 2.
- Description and screenshots of the user interface.
- Deployment architecture.
- Recommendations for improvement and scalability plans.

**5. Review project**  
The project has completed the core CRISP-DM stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.  
Both models were tested in a live environment, and deployment on Google Cloud IDX demonstrated the system's feasibility and potential for further development.

#### **Additional questions:**
**1. What production service platform will be used?**
- The system is deployed on Google Cloud IDX, which is suitable for development, testing, and small-scale deployment.

**2. How to upload a model to a production server?**
- The project uses two models: one built with XGBoost and one built with Keras (a neural network). They are uploaded to the production server as follows:
    - The XGBoost model is saved in .model format and loaded into the FastAPI application by creating an empty booster with xgb.Booster() and then calling load_model('xgb_price_prediction.model') on it. The two calls should not be chained into one assigned expression, because load_model() returns None.
    - The Keras model is saved in .keras format and loaded using load_model('mlp_model.keras').
    - A scaler for input normalization is stored as a .pkl file and loaded using joblib.load('scaler_price_prediction.pkl').
- All model files (.model, .keras, .pkl) are placed in the backend directory on the production server. When the FastAPI app starts, it loads them into memory so they are ready to serve predictions through the API endpoints.
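
The loading step described above might look like the following sketch. The file names come from the text; the helper names, the directory argument, and the use of the TensorFlow-bundled Keras import are assumptions. The ML libraries are imported inside the function so that the file check can run without them.

```python
from pathlib import Path

# File names are taken from the text above; the helper names are hypothetical.
ARTIFACT_FILES = {
    "xgb": "xgb_price_prediction.model",
    "mlp": "mlp_model.keras",
    "scaler": "scaler_price_prediction.pkl",
}


def missing_artifacts(model_dir):
    """Return the artifact file names that are not present in model_dir."""
    return [
        name for name in ARTIFACT_FILES.values()
        if not (Path(model_dir) / name).is_file()
    ]


def load_artifacts(model_dir):
    """Load all three artifacts once, for example when the FastAPI app starts."""
    missing = missing_artifacts(model_dir)
    if missing:
        raise FileNotFoundError(f"Missing model files in {model_dir}: {missing}")

    # Imported lazily so the file check above works without the ML libraries.
    import joblib
    import xgboost as xgb
    from tensorflow.keras.models import load_model  # assumption: TensorFlow-bundled Keras

    booster = xgb.Booster()
    # load_model() fills the booster in place and returns None, so it is not chained.
    booster.load_model(str(Path(model_dir) / ARTIFACT_FILES["xgb"]))
    mlp = load_model(str(Path(model_dir) / ARTIFACT_FILES["mlp"]))
    scaler = joblib.load(str(Path(model_dir) / ARTIFACT_FILES["scaler"]))
    return {"xgb": booster, "mlp": mlp, "scaler": scaler}
```

Failing early with a clear `FileNotFoundError` makes an incomplete upload visible at startup rather than on the first prediction request.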


**3. How to upload new data to the production server?**
- In this project, users submit new data through the web form (car_prediction.html). When the form is submitted, the frontend sends the input data to the FastAPI backend in an HTTP request.  
The backend then:
    - Receives the data,
    - Scales it using the pre-loaded scaler (if using the neural network model),
    - Passes it to the appropriate ML model (XGBoost or Keras),
    - Returns the predicted car price to the frontend.
- In future improvements, additional data upload options can be added, such as:
    - Uploading CSV files containing multiple car records.
    - Connecting to external APIs or databases to fetch real-time car data for batch prediction or retraining.
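
The four backend steps listed above (receive, scale for the neural network only, predict, return) can be sketched independently of the web framework. Everything here is illustrative: `predict_price`, `demo_scaler`, and `demo_models` are hypothetical stand-ins and not the trained models or the project's actual endpoint code.

```python
def predict_price(features, model_name, models, scaler):
    """Run one prediction following the backend steps.

    features: list of numeric feature values, already encoded and ordered.
    models: dict mapping a model name to a callable that returns a price.
    scaler: callable applied only before the neural network model.
    """
    # Step 1: the received data arrives as `features`; reject unknown model names.
    if model_name not in models:
        raise ValueError(f"Unknown model: {model_name}")
    # Step 2: scale only when the neural network model is selected.
    inputs = scaler(features) if model_name == "mlp" else features
    # Step 3: pass the inputs to the selected model.
    price = models[model_name](inputs)
    # Step 4: build the response returned to the frontend.
    return {"model": model_name, "predicted_price": round(float(price), 2)}


# Stand-ins so the sketch runs without the trained artifacts (not the real models).
def demo_scaler(values):
    return [v / 10.0 for v in values]


demo_models = {
    "xgb": lambda xs: sum(xs),
    "mlp": lambda xs: sum(xs) * 2,
}
```

In the deployed application, the same function body would sit inside a FastAPI route, with the loaded booster, Keras model, and scaler replacing the stand-ins.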

**4. What kind of user interface will be needed?**
- The current user interface is a simple web form where users input car features (e.g., number of seats, fuel type, power, year). It is designed to be intuitive and accessible for general users.
- In the future, the user interface will be enhanced with data visualization features, allowing users to:
    - View predicted car prices in graphical form.
    - Compare the predictions of different models (e.g., Model 1 vs Model 2) using charts or graphs.
    - Explore trends, such as average prices over time or the correlation between features and predicted prices, through interactive visualizations.

**5. Who will use the model?**  
Potential users include:

- General users who want to estimate the resale value of a car.
- Car dealerships or marketplaces looking to integrate a quick pricing tool.
- Students and instructors using it as a real-world machine learning application demo.

In [42]:
import pandas as pd
import numpy as np
import xgboost as xgb

# Load the trained XGBoost booster and list the features it expects
model = xgb.Booster()
model.load_model('../Data/xgb_price_prediction.model')
print(model.feature_names)

['engine_type', 'fuel_type', 'transmission', 'body_type', 'has_incidents', 'wheel_system', 'horsepower', 'maximum_seating', 'mileage', 'torque', 'year', 'combined_fuel_economy', 'legroom', 'major_options_count', 'size_of_vehicle']
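
The printed list gives the feature names and order stored in the booster. Assuming incoming form data arrives as a dictionary of already-encoded values, a small check like the one below could arrange it in that order and fail early when a field is missing. `order_features` is a hypothetical helper, not part of XGBoost.

```python
# Feature order copied from the model.feature_names output above.
FEATURE_NAMES = [
    'engine_type', 'fuel_type', 'transmission', 'body_type', 'has_incidents',
    'wheel_system', 'horsepower', 'maximum_seating', 'mileage', 'torque',
    'year', 'combined_fuel_economy', 'legroom', 'major_options_count',
    'size_of_vehicle',
]


def order_features(payload, feature_names=FEATURE_NAMES):
    """Return the payload values in the model's feature order; fail on missing keys."""
    missing = [name for name in feature_names if name not in payload]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return [payload[name] for name in feature_names]
```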


#### **Feature Descriptions - sample data**

1. **engine_type**: The engine configuration, e.g. I4 or V6.
2. **fuel_type**: Main type of fuel the vehicle uses.
3. **transmission**: Type of transmission, such as Automatic or Manual.
4. **body_type**: Body type of the vehicle, such as Convertible, Hatchback, or Sedan.
5. **has_incidents (has_accidents)**: Whether the vehicle's VIN has any accidents registered.
6. **wheel_system**: Traction system of the vehicle, such as AWD or FWD.
7. **horsepower**: Power produced by the engine.
8. **maximum_seating**: Total number of seats.
9. **mileage**: Distance the vehicle has travelled, measured in miles.
10. **torque**: Maximum torque (the rotational force on the drive shaft) together with the engine speed at which it is reached, e.g. 383 lb-ft @ 4,100 RPM.
11. **year**: The year the car was built.
12. **combined_fuel_economy**: (city_fuel_economy + highway_fuel_economy) / 2
    - **city_fuel_economy**: Fuel economy in city traffic in km per litre.
    - **highway_fuel_economy**: Highway fuel economy in km per litre.

13. **legroom**: Combined legroom: front_legroom (legroom at the front passenger seat) plus back_legroom (legroom in the rear seat), both measured in inches.
    - cars_df['legroom'] = cars_df['front_legroom'] + cars_df['back_legroom']
14. **major_options_count**: Number of optional packages on the vehicle, counted from major_options.
15. **size_of_vehicle**: size = length + width + height + wheelbase + fuel_tank_volume
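
The derived features above can be computed from the raw columns roughly as follows. This is a sketch, not the project's preprocessing code. It assumes the raw measurements are strings such as "45.3 in" or "26 gal" (as in the sample rows below) and that major_options is a stringified list. The helper names are hypothetical, and counting missing options as 0 is an assumption.

```python
import ast
import re

import pandas as pd


def leading_number(value):
    """Parse the numeric prefix of strings such as '45.3 in' or '26 gal'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(value))
    return float(match.group(1)) if match else float("nan")


def count_options(value):
    """Count entries in a stringified list; missing or malformed values count as 0 (assumption)."""
    if not isinstance(value, str):
        return 0
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return 0
    return len(parsed) if isinstance(parsed, (list, tuple)) else 0


def add_derived_features(cars_df):
    """Return a copy of cars_df with the four derived columns added."""
    out = cars_df.copy()
    out["combined_fuel_economy"] = (
        out["city_fuel_economy"] + out["highway_fuel_economy"]
    ) / 2
    out["legroom"] = (
        out["front_legroom"].map(leading_number)
        + out["back_legroom"].map(leading_number)
    )
    size_cols = ["length", "width", "height", "wheelbase", "fuel_tank_volume"]
    # Missing measurements become NaN and propagate into the sum.
    out["size_of_vehicle"] = sum(out[col].map(leading_number) for col in size_cols)
    out["major_options_count"] = out["major_options"].map(count_options)
    return out
```

Note that size_of_vehicle adds inches and gallons together, following the formula as written. It serves as a rough size indicator, not a physical measurement.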


In [None]:

file_path = "D:/JAMK/S2/AIDA-Project/used_cars_data/used_cars_data.csv"
df_org = pd.read_csv(file_path)



In [43]:
pd.set_option('display.max_columns', None)
# Load sample rows for testing
df_org[['price','engine_type', 'fuel_type', 'transmission', 'body_type', 'has_accidents', 'wheel_system', 'horsepower', 'maximum_seating', 'mileage', 'torque', 'year', 'city_fuel_economy','highway_fuel_economy', 'front_legroom','back_legroom', 'major_options','length','width', 'height', 'wheelbase' ,'fuel_tank_volume']].sample(5)

Unnamed: 0,price,engine_type,fuel_type,transmission,body_type,has_accidents,wheel_system,horsepower,maximum_seating,mileage,torque,year,city_fuel_economy,highway_fuel_economy,front_legroom,back_legroom,major_options,length,width,height,wheelbase,fuel_tank_volume
2326013,45526.0,V8 Flex Fuel Vehicle,Flex Fuel Vehicle,A,SUV / Crossover,False,4WD,355.0,8 seats,33903.0,"383 lb-ft @ 4,100 RPM",2017,,,45.3 in,39 in,"['Leather Seats', 'Navigation System', 'Suspen...",204 in,80.5 in,74.4 in,116 in,26 gal
2304227,20982.0,I4,Gasoline,CVT,Sedan,,FWD,188.0,5 seats,7.0,"180 lb-ft @ 3,600 RPM",2020,28.0,39.0,43.8 in,35.2 in,,192.9 in,72.9 in,56.7 in,111.2 in,16.2 gal
656107,99990.0,V6,Gasoline,A,Sedan,False,AWD,362.0,5 seats,,,2020,18.0,28.0,41.4 in,34.1 in,"['Sport Package', 'Leather Seats', 'Driver Ass...",206.9 in,83.9 in,58.8 in,124.6 in,21.1 gal
1862331,37995.0,I6 Diesel,Diesel,A,Pickup Truck,False,4WD,350.0,5 seats,89080.0,"650 lb-ft @ 1,500 RPM",2011,,,41 in,45.3 in,"['Leather Seats', 'Sunroof/Moonroof', 'Navigat...",248.4 in,97 in,78.3 in,160 in,34 gal
2087261,37376.0,,,A,Van,,,,,48.0,,2019,,,,,,,,,,
