Excellent! Now that you've shared your dataset, here's **how to practice both Linear and Logistic Regression** with your real petrol data.

---

## ✅ Step-by-Step Practice Guide

### 📌 Dataset Overview

You've got 10 rows and 11 columns. Let's rename the columns slightly to simplify programming (replace spaces with underscores; parentheses in the original headers are kept, so the names match the code below):

| Column Name (Cleaned)             | Meaning                       |
| --------------------------------- | ----------------------------- |
| `Daily_Oil_Consumption_(Barrels)` | Target for regression         |
| `Price_Per_Gallon_(USD)`          | Feature                       |
| `GDP_Per_Capita_(USD)`            | Feature                       |
| `Gallons_GDP_Per_Capita_Can_Buy`  | Feature                       |
| `Yearly_Gallons_Per_Capita`       | Alternative regression target |
| `High_Consumption` (derived)      | Target for classification     |
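
The renaming step can be sketched like this. It's a minimal illustration, assuming the raw headers contain spaces and that some numeric columns may have been read as strings with thousands separators. The two sample rows are invented placeholders, not values from your file.

```python
import pandas as pd

# Hypothetical stand-in for the raw file: values are placeholders, not real data.
raw = pd.DataFrame({
    "Country": ["A", "B"],
    "Daily Oil Consumption (Barrels)": ["1,500,000", "250,000"],
    "Price Per Gallon (USD)": [4.5, 3.2],
})

# Replace spaces with underscores; parentheses are kept as-is.
raw.columns = raw.columns.str.replace(" ", "_", regex=False)

# Assumption: numbers may arrive as strings like "1,500,000"; coerce them to numeric.
col = "Daily_Oil_Consumption_(Barrels)"
raw[col] = pd.to_numeric(raw[col].astype(str).str.replace(",", "", regex=False))

print(raw.columns.tolist())
print(raw.dtypes)
```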

---

## 🔷 LINEAR REGRESSION

We’ll predict **Daily Oil Consumption (Barrels)** from GDP per capita and price per gallon:

### ▶ Code:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load data
data = pd.read_csv("your_file.csv")  # or use DataFrame directly
data.columns = data.columns.str.replace(' ', '_')

# Features and target
X = data[['Price_Per_Gallon_(USD)', 'GDP_Per_Capita_(USD)']]
y = data['Daily_Oil_Consumption_(Barrels)']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train linear regression
lr_model = LinearRegression()
lr_model.fit(X_train, y_train)
pred = lr_model.predict(X_test)

# Evaluation
print("MSE:", mean_squared_error(y_test, pred))
print("Predicted:", pred)
```
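
With only 10 rows, an 80/20 split leaves two test rows, so a single MSE figure is noisy. One option is leave-one-out cross-validation, sketched below. The data here are synthetic placeholders generated with NumPy (not your petrol values), and the two columns only stand in for price and GDP per capita.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Synthetic stand-in for a 10-row dataset (placeholder values, not the petrol data).
rng = np.random.default_rng(42)
X = rng.uniform(low=[1.0, 1_000.0], high=[8.0, 60_000.0], size=(10, 2))  # price, GDP per capita
y = 50.0 * X[:, 1] - 10_000.0 * X[:, 0] + rng.normal(0.0, 5_000.0, size=10)

# Each row is held out once; the model is refit on the other nine.
scores = cross_val_score(
    LinearRegression(), X, y,
    cv=LeaveOneOut(),
    scoring="neg_mean_squared_error",
)

# scikit-learn returns negated MSE so that higher is better; flip the sign back.
loo_mse = -scores.mean()
print("Leave-one-out MSE:", loo_mse)
```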

---

## 🔷 LOGISTIC REGRESSION

We'll derive a binary column:
`High_Consumption = 1` if daily oil consumption exceeds 4 million barrels, else `0`.

### ▶ Code:

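```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Reuses `data` and `train_test_split` from the linear regression block above.
# Create binary target: 1 if daily consumption exceeds 4 million barrels, else 0
data['High_Consumption'] = (data['Daily_Oil_Consumption_(Barrels)'] > 4000000).astype(int)

X = data[['Price_Per_Gallon_(USD)', 'GDP_Per_Capita_(USD)']]
y = data['High_Consumption']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train logistic regression; scaling helps the solver converge because
# price and GDP per capita are on very different scales
log_model = make_pipeline(StandardScaler(), LogisticRegression())
log_model.fit(X_train, y_train)
log_pred = log_model.predict(X_test)

# Evaluation (labels=[0, 1] keeps the matrix 2x2 even if the test set has one class)
print("Accuracy:", accuracy_score(y_test, log_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, log_pred, labels=[0, 1]))
```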
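
Before fitting, it may be worth checking that both classes are present. With 10 rows, a fixed threshold can easily leave one class nearly empty, and `LogisticRegression` raises an error when the training labels contain a single class. The sketch below uses invented placeholder consumption values, and `has_both_classes` is a hypothetical helper name, not part of any library.

```python
import pandas as pd

def has_both_classes(labels: pd.Series, min_per_class: int = 2) -> bool:
    """Hypothetical helper: True if labels 0 and 1 each appear at least min_per_class times."""
    counts = labels.value_counts()
    return all(counts.get(cls, 0) >= min_per_class for cls in (0, 1))

# Placeholder consumption values in barrels per day (not the real data).
consumption = pd.Series([19_000_000, 12_000_000, 4_500_000, 3_000_000, 2_500_000,
                         2_000_000, 1_800_000, 1_500_000, 900_000, 400_000])
labels = (consumption > 4_000_000).astype(int)

print(labels.value_counts().to_dict())
print("Both classes present:", has_both_classes(labels))
```

If the check fails on your data, a different threshold (for example the median consumption) is one possible adjustment.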

---

## 🔍 Analysis Questions You Can Explore

| Question                                                 | Technique            |
| -------------------------------------------------------- | -------------------- |
| How does oil consumption vary with GDP and price?        | Linear Regression    |
| Can we classify countries as high/low consumers?         | Logistic Regression  |
| Does price strongly affect oil use?                      | Coefficient analysis |
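
For the coefficient question, one common approach is to standardize the features first, so the coefficients are comparable even though price and GDP per capita sit on very different scales. Here's a minimal sketch on synthetic placeholder data (the feature names match the columns used above, but the numbers do not come from your file):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data; by construction GDP drives the target more than price.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "Price_Per_Gallon_(USD)": rng.uniform(1.0, 8.0, size=10),
    "GDP_Per_Capita_(USD)": rng.uniform(1_000.0, 60_000.0, size=10),
})
y = 100.0 * X["GDP_Per_Capita_(USD)"] - 20_000.0 * X["Price_Per_Gallon_(USD)"]

# Standardize so each coefficient means "change in target per one standard deviation".
X_scaled = StandardScaler().fit_transform(X)
model = LinearRegression().fit(X_scaled, y)

# Rank features by the absolute size of their standardized coefficient.
coefs = pd.Series(model.coef_, index=X.columns).sort_values(key=abs, ascending=False)
print(coefs)
```

Keep in mind that with 10 rows the coefficient estimates are unstable, so treat the ranking as a hint rather than a conclusion.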