# What is Predictive Analytics?

Predictive analytics means using historical data to estimate what is likely to happen in the future.

Think of it as a weather forecast. Just as meteorologists use past weather data to predict tomorrow's weather, predictive analytics uses past information (data) to make educated guesses about future events.

# Key Concepts in Predictive Analytics

**Data Collection**

Every prediction starts with data. Without relevant historical records, a model has nothing to learn from.

Example: A retail store might use customer purchase history to predict future sales.

**Patterns and Trends**

Predictive analytics works by finding patterns in the data.

Example: If customers usually buy umbrellas when it rains, we can predict umbrella sales will rise if rain is expected.

**Models**

A model is like a recipe. It uses the data to create predictions.

Example: If past data shows that students who study longer tend to score higher, a model can learn that relationship and predict higher scores for students who study more.
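To make the recipe idea concrete, here is a minimal sketch of a one-feature model fitted with ordinary least squares in plain Python. The study hours and scores are made up for illustration, and `fit_line` and `predict_score` are hypothetical helper names, not part of any library.

```python
# Hypothetical data: hours studied and exam scores (invented for this example).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 78.0]


def fit_line(x, y):
    """Ordinary least squares for one feature: y is approximated by slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How x and y vary together, divided by how much x varies on its own
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)


def predict_score(study_hours):
    """Apply the learned recipe to a new input."""
    return slope * study_hours + intercept


print(f"score = {slope:.1f} * hours + {intercept:.1f}")
print(f"Predicted score for 6 hours: {predict_score(6):.1f}")
```

The "recipe" here is just two numbers, a slope and an intercept, learned from the examples. Real models usually have more inputs, but the idea is the same.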

**Outcomes**

Predictions could be numbers (e.g., sales next week) or categories (e.g., whether a customer will buy or not).

# What is Machine Learning?

Machine Learning (ML) is a set of techniques that let computers learn how to make predictions from examples.

Instead of programming rules for every situation, we let the computer learn patterns from data.

Example: If you show a computer lots of pictures of cats and dogs, it can learn to tell the difference on its own.

# General Methodologies in Predictive Analytics
1. **Define the Problem**
What are you trying to predict?
Example: Predict which customers will leave your service.
2. **Collect Data**
Use historical data.
Example: Customer feedback, sales records, or website activity.
3. **Clean the Data**
Remove errors or incomplete information.
Example: Fix missing values or remove duplicate entries.
4. **Choose the Right Model**
Pick a model type that fits the problem, such as regression, decision trees, or neural networks.
Example: A regression model can predict house prices based on location and size.
5. **Train the Model**
Show the model lots of examples so it learns.
Example: Provide past house sales, each with the house's features and its sale price.
6. **Test the Model**
Check whether the model performs well on data it did not see during training.
Example: Use the model on recent house sales and compare predictions with actual prices.
7. **Deploy and Monitor**
Use the model in real-world scenarios and improve it over time.
Example: A bank might use it to predict loan defaults.
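
The steps above can be sketched end to end for the customer-churn example. This is only an illustration under stated assumptions: the data is synthetic, the rule "more than 2 complaints means the customer leaves" is invented so the model has something to learn, and a decision tree from scikit-learn is just one possible model choice.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Steps 1-2: the target is churn (1 = left, 0 = stayed); the data is synthetic.
n_customers = 200
complaints = rng.integers(0, 6, size=n_customers)       # values 0 to 5
tenure_months = rng.integers(1, 61, size=n_customers)   # values 1 to 60
X = np.column_stack([complaints, tenure_months])
y = (complaints > 2).astype(int)                        # invented churn rule

# Step 3: nothing to clean, because the synthetic data has no gaps or duplicates.

# Steps 4-5: choose a model and train it on 80% of the rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Step 6: test on the rows the model has not seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")

# Step 7 (deploy and monitor) happens outside a script like this one.
```

Real churn data is far noisier than this invented rule, so accuracy on a real project would normally be lower.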

# Types of Predictive Models

**Regression: Predicts a number.**

Example: How much a car will cost next year.

**Classification: Predicts categories.**

Example: Whether a customer will buy or not (Yes/No).

**Clustering: Groups similar things together.**

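Example: Segmenting customers into groups based on their interests. Unlike regression and classification, clustering does not need a labeled outcome; it finds the groups on its own.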
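
The three model types can be compared side by side in a short sketch. It assumes scikit-learn is available and uses tiny made-up datasets; the specific estimators (`LinearRegression`, `LogisticRegression`, `KMeans`) are just one common choice for each type.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a number (toy data that follows price = 2 * size + 1).
sizes = np.array([[1.0], [2.0], [3.0], [4.0]])
prices = np.array([3.0, 5.0, 7.0, 9.0])
reg = LinearRegression().fit(sizes, prices)
predicted_price = reg.predict([[5.0]])[0]

# Classification: predict a category (1 = buys, 0 = does not buy).
visits = np.array([[1], [2], [3], [8], [9], [10]])
bought = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(visits, bought)
predicted_class = clf.predict([[12]])[0]

# Clustering: group similar rows without any labels.
points = np.array(
    [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
cluster_labels = kmeans.labels_

print("Predicted price:", round(predicted_price, 2))
print("Predicted class:", predicted_class)
print("Cluster labels:", cluster_labels)
```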

# Steps to Build a Machine Learning Model

**Step 1: Define the Problem**

Define the objective: Predict monthly insurance charges based on customer data.

Identify the target variable (charges) and the input features (e.g., age, bmi, etc.).

**Step 2: Data Loading**

Load the dataset into a DataFrame.
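
A minimal loading sketch with pandas is shown here. To keep it self-contained it reads a few invented rows from a string; in a real project the call would point at a file such as `pd.read_csv("insurance.csv")`, where that file name is only an assumption. The `sex` and `smoker` columns are also assumed for illustration.

```python
import io

import pandas as pd

# Invented rows standing in for the real dataset file.
csv_text = """age,sex,bmi,smoker,charges
25,female,24.0,no,3000.0
40,male,31.5,yes,22000.0
33,female,28.2,no,4500.0
58,male,26.9,no,11000.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Separate the target variable from the input features (Step 1).
target = df["charges"]
features = df.drop(columns=["charges"])

print(df.shape)
print(list(features.columns))
```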

**Step 3: Data Exploration**

Display the first few rows of the dataset to understand the structure.

Check for null values and data types.

Generate summary statistics for numerical and categorical columns.

Visualize data distributions and relationships.
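
The exploration steps above can be sketched with a few pandas calls. The small DataFrame is made up (with one missing `bmi` value on purpose), and a correlation table stands in for the plots so the example runs without a plotting library.

```python
import numpy as np
import pandas as pd

# Invented sample with one missing bmi value.
df = pd.DataFrame({
    "age": [23, 45, 31, 52, 38],
    "bmi": [21.5, 30.1, np.nan, 27.8, 33.2],
    "smoker": ["no", "yes", "no", "no", "yes"],
    "charges": [2100.0, 25300.0, 3900.0, 9800.0, 31000.0],
})

print(df.head())                    # first rows: a quick look at the structure
print(df.dtypes)                    # data type of each column

null_counts = df.isnull().sum()     # number of missing values per column
print(null_counts)

print(df.describe(include="all"))   # summary statistics, numeric and categorical

# Relationships between the numeric columns
correlations = df.select_dtypes(include="number").corr()
print(correlations["charges"])

# With matplotlib installed, df.hist() would draw the distributions.
```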

**Step 4: Data Cleaning**

Handle missing values: fill, replace, or drop missing data.

Remove duplicate rows.

Fix any incorrect or inconsistent values.
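
Here is a hedged sketch of these cleaning steps in pandas. The data is invented, and filling `bmi` with the median is only one reasonable choice; dropping the rows or using the mean would also be valid.

```python
import numpy as np
import pandas as pd

# Invented data with a duplicate row, a missing value, and inconsistent labels.
df = pd.DataFrame({
    "age": [25, 40, 40, 31, 52],
    "bmi": [22.0, 28.0, 28.0, np.nan, 27.0],
    "smoker": ["yes", "No", "No", "no ", "YES"],
    "charges": [3000.0, 7000.0, 7000.0, 4500.0, 24000.0],
})

clean = df.copy()

# Handle missing values: fill the gap in bmi with the column median.
clean["bmi"] = clean["bmi"].fillna(clean["bmi"].median())

# Remove duplicate rows.
clean = clean.drop_duplicates()

# Fix inconsistent values: trim spaces and use lower case everywhere.
clean["smoker"] = clean["smoker"].str.strip().str.lower()

print(clean)
```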

**Step 5: Feature Engineering**

Create new features if necessary (e.g., group ages or categories).

Encode categorical variables (e.g., one-hot encoding or label encoding).

Drop unnecessary columns.
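
The three feature engineering steps might look like this in pandas. The columns `customer_id` and `region`, the age bins, and the group names are all assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],   # hypothetical ID column
    "age": [22, 35, 47, 63],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["north", "south", "south", "east"],
})

# Create a new feature: group ages into bands (bin edges chosen arbitrarily).
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
)

# Encode categorical variables with one-hot encoding.
# drop_first=True drops one column per variable to avoid redundant columns.
encoded = pd.get_dummies(df, columns=["smoker", "region"], drop_first=True)

# Drop a column that carries no predictive information.
encoded = encoded.drop(columns=["customer_id"])

print(encoded)
```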

**Step 6: Feature Scaling**

Standardize or normalize numerical features to bring them to a similar scale.
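
Both scaling methods are simple formulas, shown here with NumPy on made-up ages. scikit-learn's `StandardScaler` and `MinMaxScaler` apply the same formulas column by column.

```python
import numpy as np

ages = np.array([20.0, 30.0, 40.0, 50.0, 60.0])   # invented values

# Standardize: subtract the mean and divide by the standard deviation.
standardized = (ages - ages.mean()) / ages.std()

# Normalize (min-max): rescale the values into the range 0 to 1.
normalized = (ages - ages.min()) / (ages.max() - ages.min())

print("Standardized:", standardized.round(2))
print("Normalized:  ", normalized)
```

In practice the mean, standard deviation, minimum, and maximum are usually computed from the training rows only and then reused for the test rows, so that no information from the test set leaks into training.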

**Step 7: Data Splitting**

Split the data into training and testing sets. A common choice is 80% for training and 20% for testing.
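
A minimal sketch of an 80/20 split with scikit-learn's `train_test_split`, using made-up arrays. The `random_state` value is arbitrary; fixing it just makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 rows with 2 invented features
y = np.arange(10)                  # invented target values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 8 rows go to training and 2 rows go to testing.
print(len(X_train), len(X_test))
```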

**Step 8: Model Selection and Training**

Select a regression model suitable for predicting charges (e.g., Linear Regression, Random Forest, etc.).

Train the model using the training data.
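
As an illustration, the sketch below trains the two example models on synthetic data. The formula that generates `charges` is invented so the models have a pattern to learn; it is not taken from any real insurance dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_rows = 300
age = rng.integers(18, 65, size=n_rows)
bmi = rng.uniform(18, 40, size=n_rows)
smoker = rng.integers(0, 2, size=n_rows)   # 1 = smoker, 0 = non-smoker

# Invented relationship plus random noise.
charges = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 500, size=n_rows)

X = np.column_stack([age, bmi, smoker])
X_train, X_test, y_train, y_test = train_test_split(
    X, charges, test_size=0.2, random_state=0
)

# Train both candidate models on the training data only.
linear_model = LinearRegression().fit(X_train, y_train)
forest_model = RandomForestRegressor(n_estimators=100, random_state=0).fit(
    X_train, y_train
)

print("Linear coefficients (age, bmi, smoker):", linear_model.coef_.round(1))
print("Forest prediction for one test row:", forest_model.predict(X_test[:1])[0])
```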

**Step 9: Model Evaluation**

Test the model on the testing data.

Evaluate performance using metrics like Mean Absolute Error (MAE), Mean Squared Error (MSE), and R² score.
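
The three metrics can be computed with `sklearn.metrics`. The true and predicted values below are small made-up numbers so the results are easy to check by hand.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 7.0, 9.0]    # actual values (invented)
y_pred = [2.0, 5.0, 8.0, 11.0]   # model predictions (invented)

# Errors are 1, 0, -1, -2.
mae = mean_absolute_error(y_true, y_pred)   # (1 + 0 + 1 + 2) / 4 = 1.0
mse = mean_squared_error(y_true, y_pred)    # (1 + 0 + 1 + 4) / 4 = 1.5
r2 = r2_score(y_true, y_pred)               # 1 - 6 / 20 = 0.7

print(f"MAE: {mae:.2f}  MSE: {mse:.2f}  R²: {r2:.2f}")
```

Lower MAE and MSE mean smaller errors. An R² of 1 means a perfect fit, and 0 means the model does no better than always predicting the average.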

**Step 10: Model Tuning**

Optimize the model through hyperparameter tuning (e.g., GridSearchCV).

Experiment with different feature subsets for better results.
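
A small `GridSearchCV` sketch is shown below, assuming a Random Forest model and synthetic data. The parameter grid is kept tiny on purpose; the values in it are arbitrary examples, not recommended settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(120, 2))
y = 3 * X[:, 0] + rng.normal(0, 1, size=120)   # invented target, driven by the first feature

# Every combination in the grid is trained and scored with 3-fold cross-validation.
param_grid = {"n_estimators": [20, 50], "max_depth": [2, 4]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated MAE:", -search.best_score_)
```

The scoring name starts with `neg_` because scikit-learn always maximizes the score, so the MAE is stored as a negative number and flipped back when printed.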




