# **<font color='darkorange'>Introduction to Machine Learning</font>**

**Machine Learning** is a subset of Artificial Intelligence (AI) that enables computers to **learn from data** and make decisions or predictions **without being explicitly programmed**. Instead of relying on hard-coded rules, machine learning systems improve their performance as they process more data over time.

<img src="https://drive.google.com/uc?id=1jEUBnfczk1eFom8nS1duQC8cVFEkHvDm" alt="Types" width="700"/>

---
## **<font color='blue'>What is Machine Learning?</font>**

Machine Learning focuses on developing algorithms that can:  
- **Learn** from historical data.  
- **Identify patterns** and relationships within the data.  
- **Predict outcomes** or take actions on new, unseen data.

### **Key Characteristics**:
1. **Data-Driven**: ML systems rely heavily on data to make decisions.  
2. **Iterative**: Model performance typically improves as the model is trained on more representative data, although the gains tend to diminish.  
3. **Adaptive**: ML models can adapt and adjust as the environment or data changes.

---
- [Machine Learning is the] field of study that gives computers the ability to learn
without being explicitly programmed.
— Arthur Samuel, 1959
- A computer program is said to learn from experience E with respect to some task T
and some performance measure P, if its performance on T, as measured by P,
improves with experience E.
— Tom Mitchell, 1997

<div style="display: flex; justify-content: space-around;">
    <img src="https://drive.google.com/uc?id=1-Mdprb3HNf_cbjjHhde4tYCTrWYKhntx" width="500"/>
    <img src="https://drive.google.com/uc?id=1p4gasbgwONGCXcYbOpa59ur2nRuFjw15" width="500"/>
</div>

---

## **<font color='green'>Why is Machine Learning Important?</font>**

Machine Learning is at the core of many modern technologies and applications, including:  
- Personalized recommendations (e.g., Netflix, Amazon).  
- Predictive maintenance in industries.  
- Spam detection in emails.  
- Medical diagnosis and image analysis.  
- Autonomous vehicles and robotics.  

By analyzing vast amounts of data, ML helps businesses and organizations:  
- Automate decision-making.  
- Reduce costs and improve efficiency.  
- Discover insights that humans may overlook.  

---

## **<font color='purple'>Types of Machine Learning</font>**

Machine Learning problems are broadly categorized into three types based on the nature of data and tasks:

1. **<font color='blue'>Supervised Learning</font>**  
   - Learning from **labeled data** (input-output pairs).  
   - Example: Predicting house prices based on size, bedrooms, and location.

2. **<font color='green'>Unsupervised Learning</font>**  
   - Learning from **unlabeled data** to identify patterns.  
   - Example: Grouping customers into segments based on spending habits.

3. **<font color='purple'>Reinforcement Learning</font>**  
   - Learning through **trial and error** to maximize rewards in an environment.  
   - Example: Teaching a robot to navigate a room by rewarding correct moves.

---

## **<font color='darkorange'>Conclusion</font>**

Machine Learning has transformed the way systems interact with data, enabling innovations in **business, healthcare, finance, and technology**. By understanding its core concepts and categories, we can develop powerful solutions to solve real-world problems.

---

Next, let’s explore the **Types of Machine Learning Problems** in detail:


# **<font color='darkorange'>Types of Machine Learning Problems</font>**

Machine Learning problems are broadly categorized into three types based on the nature of data and tasks:

1. **<font color='blue'>Supervised Learning</font>**  
2. **<font color='green'>Unsupervised Learning</font>**  
3. **<font color='purple'>Reinforcement Learning</font>**

Each type of problem involves different goals, algorithms, and use cases.

<img src="https://drive.google.com/uc?id=11fWx1qW0U1UqIoBva5FF6DMooiu0kRLP" alt="Types" width="700"/>

## **<font color='blue'>1. Supervised Learning</font>**

**Definition**:  
In supervised learning, the model learns a mapping between **input features** and their corresponding **target labels**. It is called "supervised" because the learning process is guided by labeled data.

### **Goal**:  
- Train a model to make predictions on unseen data based on labeled training data.

### **Key Concepts**:  
- **Input Features (X)**: The independent variables.  
- **Target Labels (Y)**: The dependent variables (known outcomes).  
- **Training Data**: Labeled data used to train the model.  
- **Testing Data**: Data used to evaluate the model's performance.


### **Types**:
1. **<font color='darkcyan'>Classification</font>**: Predicts a **categorical label**.  
   - Example: Predict whether an email is "Spam" or "Not Spam" (binary classification).  
   - Example: Classify handwritten digits into classes (0–9).

2. **<font color='darkcyan'>Regression</font>**: Predicts a **continuous value**.  
   - Example: Predict the price of a house based on its size and location.  
   - Example: Forecast the temperature for the next day.


## <font color='blue'>**Supervised Learning Example Data**</font>

In supervised learning, the data has **input features** ($X$) and corresponding **labels** ($Y$).

### <font color='green'>**Classification Problem**</font>
Predict whether a customer will purchase a product (<font color='red'>**Yes/No**</font>) based on their age and income.

| **Customer ID** | **Age (Years)** | **Income (USD)** | **Purchased (Y/N)** |
|-----------------|-----------------|-----------------|---------------------|
| 1               | 25              | 50,000          | <font color='green'>Yes</font>                 |
| 2               | 32              | 60,000          | <font color='red'>No</font>                  |
| 3               | 47              | 80,000          | <font color='green'>Yes</font>                 |
| 4               | 52              | 45,000          | <font color='red'>No</font>                  |
| 5               | 29              | 70,000          | <font color='green'>Yes</font>                 |
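
As a rough illustration of classification on this table, the sketch below uses a 1-nearest-neighbor rule: a new customer gets the label of the most similar known customer. The min-max scaling, the helper names (`scale`, `predict_purchase`), and the new customer (age 30, income 72,000) are assumptions made for this example, not part of the original material.

```python
import math

# Rows copied from the table above: (age in years, income in USD, purchased).
customers = [
    (25, 50_000, "Yes"),
    (32, 60_000, "No"),
    (47, 80_000, "Yes"),
    (52, 45_000, "No"),
    (29, 70_000, "Yes"),
]

ages = [age for age, _, _ in customers]
incomes = [income for _, income, _ in customers]


def scale(value, values):
    """Min-max scale a value relative to the range seen in the training data."""
    low, high = min(values), max(values)
    return (value - low) / (high - low)


def predict_purchase(age, income):
    """Return the label of the nearest training customer (1-nearest neighbor)."""
    query = (scale(age, ages), scale(income, incomes))

    def distance(row):
        point = (scale(row[0], ages), scale(row[1], incomes))
        return math.dist(query, point)

    return min(customers, key=distance)[2]


# Hypothetical new customer, not part of the table.
prediction = predict_purchase(age=30, income=72_000)
print(prediction)
```

Scaling matters here because income is numerically far larger than age; without it, the distance would be driven almost entirely by income. With five rows this is only a toy, and the prediction simply mirrors the closest row (customer 5).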

---

### <font color='green'>**Regression Problem**</font>
Predict the **house price** based on size (sq ft), number of bedrooms, and age of the house.

| **House ID** | **Size (sq ft)** | **Bedrooms** | **Age (Years)** | **Price (USD)** |
|--------------|------------------|--------------|-----------------|-----------------|
| 1            | 1500             | 3            | 10              | <font color='blue'>300,000</font>         |
| 2            | 2000             | 4            | 5               | <font color='blue'>450,000</font>         |
| 3            | 1800             | 3            | 8               | <font color='blue'>350,000</font>         |
| 4            | 2500             | 5            | 2               | <font color='blue'>600,000</font>         |
| 5            | 1700             | 2            | 15              | <font color='blue'>280,000</font>         |
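
To make the regression idea concrete, here is a minimal sketch that fits a straight line to the table above using size as the only feature. Restricting the model to one feature and using the closed-form least-squares solution are simplifying assumptions for illustration; the 2,200 sq ft query is hypothetical.

```python
# Sizes (sq ft) and prices (USD) copied from the table above.
sizes = [1500, 2000, 1800, 2500, 1700]
prices = [300_000, 450_000, 350_000, 600_000, 280_000]

mean_size = sum(sizes) / len(sizes)
mean_price = sum(prices) / len(prices)

# Closed-form least squares for one feature:
# slope = covariance(size, price) / variance(size)
covariance = sum((s - mean_size) * (p - mean_price) for s, p in zip(sizes, prices))
variance = sum((s - mean_size) ** 2 for s in sizes)
slope = covariance / variance
intercept = mean_price - slope * mean_size


def predict_price(size_sqft):
    """Predict a price from size alone using the fitted line."""
    return intercept + slope * size_sqft


print(f"price = {intercept:.0f} + {slope:.2f} * size")
print(f"Predicted price for 2,200 sq ft: {predict_price(2200):.0f}")
```

On these five rows the fitted slope is about 334 USD per additional square foot. A real model would also use bedrooms and age, and would be evaluated on data it was not fitted to.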

<img src="https://drive.google.com/uc?id=1s6Nx_-u82ADxRj3dL5LgPbLRhk5lIuyj" alt="Types" width="700"/>

---


### **Example Algorithms**:
- Linear Regression  
- Logistic Regression
- k-Nearest Neighbors
- Decision Trees  
- Random Forest  
- Support Vector Machines (SVM)  
- Neural Networks  

---

## **<font color='green'>2. Unsupervised Learning</font>**

**Definition**:  
In unsupervised learning, the model learns patterns and structures from **unlabeled data** without any predefined labels or outcomes.

### **Goal**:  
- Discover hidden structures or patterns in data.

### **Key Concepts**:  
- **Input Features (X)**: Independent variables with no target labels.  
- **Clusters**: Groups of similar data points.  
- **Dimensionality Reduction**: Simplifying data by reducing features.


### **Types**:
1. **<font color='teal'>Clustering</font>**: Group data points into similar clusters.  
   - Example: Segment customers into different groups based on their purchasing behavior.  
   - Example: Identify communities in a social network.

2. **<font color='teal'>Dimensionality Reduction</font>**: Reduce the number of features in the data while retaining important information.  
   - Example: Compress image data for faster processing.  
   - Example: Visualize high-dimensional data in 2D/3D.
  



## <font color='purple'>**Unsupervised Learning Example Data**</font>

In unsupervised learning, the data contains **input features only** ($X$). There are no target labels.

### <font color='green'>**Clustering Problem**</font>
Group customers based on their **age** and **spending score**.

| **Customer ID** | **Age (Years)** | **Spending Score** |
|-----------------|-----------------|--------------------|
| 1               | 22              | <font color='blue'>85</font>                 |
| 2               | 40              | <font color='red'>15</font>                 |
| 3               | 35              | <font color='blue'>60</font>                 |
| 4               | 28              | <font color='green'>77</font>                 |
| 5               | 55              | <font color='red'>20</font>                 |
| 6               | 19              | <font color='blue'>90</font>                 |
| 7               | 48              | <font color='red'>25</font>                 |

**Goal**: Discover groups of customers with similar purchasing behaviors.
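
The grouping can be sketched with a small, dependency-free k-means loop on the table above. The choice of two clusters, the initialization at the first two customers, and the use of unscaled features are all assumptions made to keep the example short.

```python
import math

# (age, spending score) pairs copied from the table above.
points = [(22, 85), (40, 15), (35, 60), (28, 77), (55, 20), (19, 90), (48, 25)]


def k_means(points, centroids, max_iterations=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Assumption: two clusters, initialized at the first two customers.
labels, centroids = k_means(points, centroids=[points[0], points[1]])
print(labels)
print(centroids)
```

With this initialization the loop settles on customers 1, 3, 4, 6 (younger, higher spending) in one cluster and customers 2, 5, 7 (older, lower spending) in the other. No labels were given; the groups come only from the feature values.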

---

### <font color='green'>**Dimensionality Reduction Problem**</font>
Reduce the number of features in a dataset (e.g., a dataset with multiple measurements on flowers).

| **Sample ID** | **Petal Length (cm)** | **Petal Width (cm)** | **Sepal Length (cm)** | **Sepal Width (cm)** |
|---------------|-----------------------|----------------------|-----------------------|----------------------|
| 1             | <font color='blue'>1.4</font>                   | <font color='green'>0.2</font>                  | 4.7                   | 3.2                  |
| 2             | <font color='blue'>1.5</font>                   | <font color='green'>0.4</font>                  | 5.1                   | 3.5                  |
| 3             | <font color='blue'>4.5</font>                   | <font color='red'>1.5</font>                  | 6.4                   | 2.9                  |
| 4             | <font color='blue'>5.1</font>                   | <font color='red'>1.8</font>                  | 7.0                   | 3.1                  |
| 5             | <font color='blue'>4.7</font>                   | <font color='red'>1.4</font>                  | 6.3                   | 3.3                  |

**Goal**: Use techniques like **PCA** to reduce these features into fewer dimensions while retaining important patterns.
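
A minimal PCA sketch on the five rows above, assuming NumPy is available. It computes the principal components from the SVD of the centered data and keeps the first two; keeping two components is an arbitrary choice for illustration.

```python
import numpy as np

# Rows copied from the table above:
# petal length, petal width, sepal length, sepal width (all in cm).
X = np.array([
    [1.4, 0.2, 4.7, 3.2],
    [1.5, 0.4, 5.1, 3.5],
    [4.5, 1.5, 6.4, 2.9],
    [5.1, 1.8, 7.0, 3.1],
    [4.7, 1.4, 6.3, 3.3],
])

# Center each feature, then take the SVD of the centered data.
X_centered = X - X.mean(axis=0)
_, singular_values, components = np.linalg.svd(X_centered, full_matrices=False)

# Fraction of the total variance explained by each principal component.
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

# Project the 4 features onto the first 2 principal components.
X_reduced = X_centered @ components[:2].T

print(explained_variance_ratio.round(3))
print(X_reduced.round(2))
```

Because petal length, petal width, and sepal length move together in this sample, the first component captures most of the variance, which is why two dimensions lose little information here.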

<img src="https://drive.google.com/uc?id=1D26H9KYCGrsTHV9VkGju0gCo5Fvg5AN5" alt="Types" width="700"/>


### **Example Algorithms**:
- K-Means Clustering  
- Hierarchical Clustering  
- Principal Component Analysis (PCA)  
- t-SNE (t-Distributed Stochastic Neighbor Embedding)  

---

## **<font color='purple'>3. Reinforcement Learning</font>**

**Definition**:  
Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize cumulative rewards.

### **Goal**:  
- Train an agent to learn an optimal strategy by interacting with the environment and receiving feedback through **rewards** or **penalties**.

### **Key Concepts**:
- **<font color='orange'>Agent</font>**: The learner or decision-maker (e.g., a robot, game player).  
- **<font color='orange'>Environment</font>**: Where the agent interacts (e.g., a game, traffic simulation).  
- **<font color='orange'>Actions</font>**: Possible moves the agent can take.  
- **<font color='orange'>Rewards</font>**: Positive or negative feedback for actions taken.

### **Real-World Examples**:
- Training robots to navigate a room.  
- Teaching computers to play games like chess or Go.  
- Optimizing traffic light controls for smooth traffic flow.

### **Example Algorithms**:
- Q-Learning  
- Deep Q Networks (DQN)  
- Policy Gradient Methods  
- SARSA (State-Action-Reward-State-Action)  
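
The agent, environment, action, and reward concepts can be sketched with tabular Q-learning on a toy problem. Everything concrete here is an assumption for illustration: a five-cell corridor, a reward of 1 for reaching the right end, a purely random exploration policy, and the values of `ALPHA` and `GAMMA`.

```python
import random

N_STATES = 5             # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)


def step(state, action):
    """Environment: move one cell; reward 1 only when the goal is reached."""
    if action == "right":
        next_state = state + 1
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(500):
    state, done = 0, False
    while not done:
        # Behavior policy: explore uniformly at random (Q-learning is off-policy).
        action_index = rng.randrange(len(ACTIONS))
        next_state, reward, done = step(state, ACTIONS[action_index])
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
        target = reward + (0.0 if done else GAMMA * max(q_table[next_state]))
        q_table[state][action_index] += ALPHA * (target - q_table[state][action_index])
        state = next_state

# Greedy policy read off the learned table for the non-terminal cells.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

After training, the greedy policy moves right in every cell, since that is the shortest path to the reward. Practical agents usually explore with an epsilon-greedy policy rather than fully random actions; random exploration is used here only to keep the sketch short.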

---

## **<font color='darkorange'>4. Summary Table</font>**

| **Type**                        | **Definition**                                                                 | **Examples**                               | **Algorithms**                  |
|---------------------------------|-------------------------------------------------------------------------------|-------------------------------------------|---------------------------------|
| **<font color='blue'>Supervised Learning</font>**  | Learning from labeled data to make predictions.                              | Spam Detection, House Price Prediction    | Linear Regression, Decision Trees |
| **<font color='green'>Unsupervised Learning</font>**| Finding hidden patterns in unlabeled data.                                   | Customer Segmentation, Anomaly Detection  | K-Means, PCA, Hierarchical Clustering |
| **<font color='purple'>Reinforcement Learning</font>** | Learning by interacting with an environment to maximize rewards.            | Game Playing, Robotics, Self-Driving Cars | Q-Learning, DQN, Policy Gradient  |

---

## **<font color='blue'>5. Conclusion</font>**

- **<font color='blue'>Supervised Learning</font>** is suitable when labeled data is available for classification or regression tasks.  
- **<font color='green'>Unsupervised Learning</font>** is used for discovering patterns, clustering, or reducing dimensions in unlabeled data.  
- **<font color='purple'>Reinforcement Learning</font>** focuses on decision-making and learning through interactions with an environment to maximize rewards.

Each type of machine learning problem requires different algorithms and approaches, depending on the nature of the data and the objective.

---


# **<font color='darkorange'>Hypothesis Space and Inductive Bias</font>**

Understanding the concepts of **<font color='blue'>hypothesis</font>**, **<font color='blue'>hypothesis space</font>**, and **<font color='green'>inductive bias</font>** is crucial in machine learning, as they influence how models learn from data and make predictions.

---

## **<font color='blue'>1. What is a Hypothesis?</font>**

In **machine learning**, a **hypothesis** is a proposed explanation or model that maps input features ($X$) to outputs ($Y$). Simply put, it is a **guess** or **assumption** about the relationship between inputs and outputs based on the given data.

### **Analogy**:  
Imagine you are a detective solving a mystery. You look at the clues (input data) and propose a **hypothesis** about who committed the crime. Your job is to test this hypothesis by collecting more evidence (training data).  

Similarly, in machine learning:
- The **data** is your evidence.  
- The **hypothesis** is the model's assumption or guess.  
- The learning algorithm's task is to find the hypothesis that best explains the data.

---

## **<font color='blue'>2. Hypothesis Space</font>**

**Definition**:  
The **hypothesis space** ($\mathcal{H}$) is the set of all possible hypotheses that a learning algorithm can consider. It defines the scope of potential models the algorithm can choose from.

### **Key Concepts**:

- **Hypothesis ($h$)**: A specific function that maps input features ($X$) to outputs ($\hat{Y}$).  
- **True Function ($f$)**: The actual relationship between input and output, which is often unknown.  
- **Model Capacity**: A larger hypothesis space can fit more complex patterns, but it may lead to overfitting.

### **Example**:  
In linear regression, the hypothesis space consists of all possible **linear functions** of the form:

$$
h(x) = w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_n x_n
$$

where $w_0, w_1, \dots, w_n$ are parameters to be learned.
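
In code, each choice of weights gives one hypothesis from this space. The sketch below builds two such hypotheses; the helper name `make_linear_hypothesis` and the weight values are made up for illustration.

```python
def make_linear_hypothesis(weights):
    """Return h(x) = w_0 + w_1 * x_1 + ... + w_n * x_n for fixed weights."""
    bias, *coefficients = weights

    def h(x):
        if len(x) != len(coefficients):
            raise ValueError("x must have one value per coefficient")
        return bias + sum(w * x_i for w, x_i in zip(coefficients, x))

    return h


# Two different hypotheses drawn from the same (linear) hypothesis space.
h_a = make_linear_hypothesis([1.0, 2.0, 0.5])
h_b = make_linear_hypothesis([0.0, -1.0, 2.0])

x = [3.0, 4.0]
print(h_a(x))  # 1.0 + 2.0 * 3.0 + 0.5 * 4.0 = 9.0
print(h_b(x))  # 0.0 - 1.0 * 3.0 + 2.0 * 4.0 = 5.0
```

Learning then amounts to searching this space for the weights whose hypothesis best matches the training data.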

<img src="https://drive.google.com/uc?id=1zSPkIokR3tBfYfxh-YmcaT6o0xW3DvQK" width=700/>

---

## **<font color='green'>3. Inductive Bias</font>**

**Definition**:  
**Inductive bias** refers to the assumptions a learning algorithm makes to generalize beyond the training data. Since learning algorithms cannot test every possible hypothesis, they rely on inductive bias to guide the selection process.

### **Types of Inductive Bias**:

1. **<font color='blue'>Preference Bias</font>**: Prefers certain hypotheses over others (e.g., simpler models).  
2. **<font color='blue'>Restriction Bias</font>**: Limits the hypothesis space (e.g., only linear models).

---

## **<font color='purple'>4. Occam's Razor Principle</font>**

**Definition**:  
The **Occam's Razor Principle** states:  
> "Among competing hypotheses that explain the data equally well, the simplest one is preferred."

**Why?**  
Simpler models are easier to understand and, because they have less capacity to fit noise, they often generalize better to unseen data.

### **Analogy**:  
If two explanations solve a mystery equally well, a detective would prefer the simpler one because it relies on fewer assumptions.  

**Example in Machine Learning**:  
In regression tasks, a straight line (linear model) is preferred over a high-degree polynomial unless the data strongly suggests otherwise. A simpler model reduces the risk of overfitting.
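
The line-versus-polynomial comparison can be tried on synthetic data. The sketch below assumes NumPy is available; the data-generating process (a linear trend plus Gaussian noise), the degrees 1 and 6, and the held-out points are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed synthetic data: a linear trend plus Gaussian noise.
x_train = np.linspace(0.0, 1.0, 12)
y_train = 2.0 * x_train + 1.0 + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0.04, 0.96, 12)
y_test = 2.0 * x_test + 1.0 + rng.normal(scale=0.2, size=x_test.size)


def mse(coefficients, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coefficients, x) - y) ** 2))


results = {}
for degree in (1, 6):
    coefficients = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = {
        "train_mse": mse(coefficients, x_train, y_train),
        "test_mse": mse(coefficients, x_test, y_test),
    }
    print(degree, results[degree])
```

The higher-degree fit can never have a larger training error than the line, because it contains the line as a special case. That says nothing about held-out error, which is the quantity Occam's Razor is concerned with, so compare the two `test_mse` values rather than the training ones.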

---

## **<font color='darkorange'>5. Relationship Between Hypothesis Space and Inductive Bias</font>**

The **hypothesis space** and **inductive bias** are interconnected:

- The **hypothesis space** defines all the possible models a learning algorithm can choose.  
- The **inductive bias** influences which hypothesis (model) is selected from the hypothesis space.  

A well-chosen inductive bias helps the model **generalize** better by narrowing down the hypothesis space to plausible solutions.

---

# **<font color='darkorange'>Evaluation: Training, Validation, and Test Sets</font>**

Evaluating a machine learning model's performance ensures that it can generalize to unseen data. This involves splitting the dataset into **training**, **validation**, and **test** sets.

<img src="https://drive.google.com/uc?id=1m2PAx79t9UVOdu8yd_K-pnS1q2NnD4u1" width=700/>

---

## **<font color='blue'>1. Training Set</font>**

**Definition**:  
The **training set** is the portion of the dataset used to train the model and learn patterns.

**Key Purpose**:  
- The model learns the relationships between input features and target labels.  
- The model adjusts its parameters to minimize the error on this data.

**Example**:  
In a dataset of 1,000 samples, **800 samples** are used to train the model.

---

## **<font color='blue'>2. Validation Set</font>**

**Definition**:  
The **validation set** is a portion of the dataset used to tune the model's hyperparameters and assess performance during training.  

**Key Purpose**:  
- It helps in identifying the model's ability to generalize **before testing**.  
- It is used for **hyperparameter tuning** (e.g., adjusting learning rate, regularization strength, etc.).  
- Helps detect overfitting early by providing feedback on intermediate model performance.

**Example**:  
Out of the 1,000 samples, **100 samples** are reserved as a validation set to evaluate the model while tuning hyperparameters.

---

## **<font color='blue'>3. Test Set</font>**

**Definition**:  
The **test set** is a portion of the dataset reserved for the final evaluation of the model's performance on unseen data.

**Key Purpose**:  
- Provides an **unbiased evaluation** of the trained model.  
- Helps estimate the model's **generalization performance** on new data.

**Example**:  
The remaining **100 samples** are used as the test set to evaluate the final model's accuracy or error metrics.
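
The 800/100/100 split from the running example can be sketched as follows. The function name `train_val_test_split`, the fixed seed, and the use of plain indices as stand-ins for samples are assumptions for illustration.

```python
import random


def train_val_test_split(samples, n_val, n_test, seed=0):
    """Shuffle a copy of the samples and cut it into three disjoint parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) - n_val - n_test
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Stand-in for a dataset of 1,000 samples: here just the indices 0..999.
dataset = range(1000)
train, val, test = train_val_test_split(dataset, n_val=100, n_test=100)
print(len(train), len(val), len(test))  # 800 100 100
```

Shuffling before cutting avoids splits that follow the original ordering of the data, and fixing the seed makes the split reproducible.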

---

## **<font color='green'>4. Why Split the Data?</font>**

- **Detect Overfitting**: Compare the model's performance on training, validation, and test sets to identify overfitting or underfitting.  
- **Hyperparameter Tuning**: Use the **validation set** to optimize the model's configuration without touching the test set.  
- **Estimate Generalization**: The test set provides a realistic measure of how the model will perform on unseen data.

---

## **<font color='darkorange'>5. Best Practices for Data Splitting</font>**

1. **Train/Validation/Test Split**:  
   - **70-80%** Training  
   - **10-15%** Validation  
   - **10-15%** Test  

2. **Cross-Validation**:  
   - Use **k-fold cross-validation** for robust evaluation, especially with small datasets.  
   - The dataset is split into $k$ subsets (folds). The model is trained $k$ times, using $k-1$ folds for training and 1 fold for validation each time.
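
The fold rotation described above can be sketched as a small generator. The name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for train, val in folds:
    print("train:", train, "validation:", val)
```

Every sample is used for validation exactly once and for training $k-1$ times; averaging the $k$ validation scores gives the cross-validation estimate.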

---

## **<font color='purple'>6. Summary Table</font>**

| **Set**         | **Purpose**                             | **Example**                  |
|------------------|-----------------------------------------|------------------------------|
| **Training Set** | Train the model to learn relationships. | 800 samples out of 1,000     |
| **Validation Set** | Tune hyperparameters, detect overfitting. | 100 samples for validation   |
| **Test Set**     | Final evaluation on unseen data.       | 100 samples for testing      |

---

# **<font color='darkorange'>Conclusion</font>**

In this section:
- We defined the **hypothesis** and the **hypothesis space** as the set of all possible models.  
- We explained **inductive bias** and its importance in generalization.  
- We introduced **Occam's Razor**, a principle that favors simpler models.    
- We introduced the **training set**, **validation set**, and **test set**.  
- We explained the purpose of each set in evaluating machine learning models.  
- We highlighted the importance of using the validation set for **hyperparameter tuning** to avoid overfitting.  
- Best practices like **cross-validation** were discussed for more robust evaluation.

Understanding these concepts ensures the development of machine learning models that **generalize well** to unseen data, providing reliable and accurate predictions.


# 🧠 Main Challenges of Machine Learning

Machine Learning is powerful but comes with various challenges. Addressing these challenges with practical examples helps in building effective models.

---

## 1️⃣ Insufficient Quantity of Training Data

A Machine Learning model requires a significant amount of training data to generalize well.  

**Example:**  
Imagine you’re training a face recognition system with only 50 images. The system may fail to recognize new faces due to insufficient diversity in the dataset.  

**Solution:**  
- Collect more diverse images.
- Use data augmentation (e.g., rotate, crop, or flip images) to artificially increase the dataset size.
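
As a minimal sketch of augmentation, the functions below flip and rotate an image stored as a list of pixel rows. The tiny 2 x 3 "image" is a stand-in; real pipelines would typically use an image library instead.

```python
def flip_horizontal(image):
    """Mirror an image left to right (image is a list of pixel rows)."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate an image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


# Tiny stand-in for a grayscale image: 2 rows x 3 columns of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

augmented = [image, flip_horizontal(image), rotate_90_clockwise(image)]
for variant in augmented:
    print(variant)
```

Each transformed copy keeps the original label, so one labeled image yields several training examples.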

---

## 2️⃣ Nonrepresentative Training Data

The training data must reflect the conditions under which the model will operate.  

**Example:**  
Suppose you’re training a weather prediction model using data from only one city. When applied to another city with different weather patterns, the predictions are inaccurate.  

**Solution:**  
- Use a dataset that includes data from multiple regions.
- Check the dataset for potential biases during exploratory data analysis.

---

## 3️⃣ Poor-Quality Data

Poor data quality, like missing values, outliers, or noise, hampers model performance.  

**Example:**  
A customer churn prediction model trained on data with inconsistent entries, like missing income values or outlier ages (e.g., 200 years), will yield unreliable results.  

**Solution:**  
- Handle missing values (e.g., replace with mean/median).
- Remove outliers or cap extreme values.
- Apply noise-reduction techniques.
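
The first two steps can be sketched with the standard library. The age values and the plausible range of 0 to 100 years are hypothetical; in practice the bounds come from domain knowledge.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]


def cap(values, lower, upper):
    """Clip every value into the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical customer ages with one missing entry and one impossible value.
ages = [25, 32, None, 200, 47]

# The median is robust to the outlier, so imputing before capping is safe here.
cleaned = cap(impute_median(ages), lower=0, upper=100)
print(cleaned)  # [25, 32, 39.5, 100, 47]
```

The median is used instead of the mean because a single extreme value such as 200 would pull the mean upward and distort the imputed entry.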

---

## 4️⃣ Irrelevant Features

Irrelevant or redundant features increase complexity without improving the model’s performance.  

**Example:**  
Predicting house prices with irrelevant features like the color of the house or the number of pets owned by the seller adds noise that the model may fit by accident.  

**Solution:**  
- Perform **feature selection** using techniques like correlation analysis or mutual information.
- Use dimensionality reduction techniques such as PCA.
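
A correlation-based filter can be sketched on the house data from the regression table earlier in this document. The `seller_pets` column and the 0.5 threshold are made up for this example, and five rows are far too few to draw real conclusions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# House data from the regression table, plus a made-up "seller_pets"
# column standing in for an irrelevant feature.
features = {
    "size_sqft": [1500, 2000, 1800, 2500, 1700],
    "bedrooms": [3, 4, 3, 5, 2],
    "age_years": [10, 5, 8, 2, 15],
    "seller_pets": [1, 2, 0, 1, 1],
}
prices = [300_000, 450_000, 350_000, 600_000, 280_000]

THRESHOLD = 0.5  # assumed cut-off on |correlation|
correlations = {name: pearson(column, prices) for name, column in features.items()}
selected = [name for name, r in correlations.items() if abs(r) >= THRESHOLD]

print(correlations)
print(selected)
```

Size, bedrooms, and age pass the filter, while the made-up pets column does not. Correlation only captures linear, one-feature-at-a-time relationships, which is one reason mutual information is listed as an alternative.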

---

## 5️⃣ Overfitting the Training Data

The model performs exceptionally well on training data but poorly on unseen data.  

**Example:**  
A decision tree that memorizes every data point, including noise, may classify training samples perfectly but fail to generalize to test data.  

**Solution:**  
- Use regularization techniques like L1/L2 penalties.  
- Prune decision trees to limit complexity.  
- Perform k-fold cross-validation to evaluate generalization.

---

## 6️⃣ Underfitting the Training Data

The model is too simple to capture the patterns in the data.  

**Example:**  
Fitting a linear regression model to predict house prices in a dataset where the relationship is clearly nonlinear will result in poor predictions.  

**Solution:**  
- Use a more complex model like polynomial regression.  
- Ensure sufficient and relevant features are used.  
- Train for more epochs if applicable.

---

## 7️⃣ Testing and Validating

Improper testing or validation can lead to misleading results.  

**Example:**  
Using the same dataset for both training and testing will give an overly optimistic estimate of the model’s performance.  

**Solution:**  
- Split the dataset into training, validation, and test sets (e.g., 60/20/20 split).  
- Use **k-fold cross-validation** for robust evaluation.

---

## 8️⃣ Hyperparameter Tuning and Model Selection

Finding the best model and its optimal settings is challenging due to many hyperparameters.  

**Example:**  
Training a Random Forest model without tuning parameters like the number of trees or depth may result in suboptimal performance.  

**Solution:**  
- Use automated tuning methods like **Grid Search**, **Random Search**, or **Bayesian Optimization**.  
- Compare multiple models (e.g., SVM, Random Forest, Neural Networks) on validation metrics to select the best one.
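
Grid search itself is a simple loop: score every candidate setting on the validation set and keep the best. The sketch below swaps in a tiny k-nearest-neighbors classifier on synthetic one-feature data instead of a Random Forest, purely to stay dependency-free; the data generator and the candidate values of `k` are assumptions.

```python
import random

rng = random.Random(0)


def make_samples(n):
    """Assumed toy data: one feature, label 1 when a noisy copy of it exceeds 0.5."""
    samples = []
    for _ in range(n):
        x = rng.random()
        label = int(x + rng.gauss(0.0, 0.15) > 0.5)
        samples.append((x, label))
    return samples


train, validation = make_samples(60), make_samples(30)


def knn_predict(x, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda sample: abs(sample[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return int(votes * 2 > k)


def validation_accuracy(k):
    """Fraction of validation samples that k-NN classifies correctly."""
    hits = sum(knn_predict(x, k) == label for x, label in validation)
    return hits / len(validation)


# Grid search: score every candidate on the validation set, keep the best.
grid = [1, 3, 5, 9, 15]
scores = {k: validation_accuracy(k) for k in grid}
best_k = max(scores, key=scores.get)
print(scores)
print("best k:", best_k)
```

The test set plays no part in this loop; it is reserved for a single final evaluation of the chosen setting.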

---

## 9️⃣ Data Mismatch

Differences between training and production data (data drift) degrade performance.  

**Example:**  
A customer recommendation model trained on last year’s product preferences might fail because customer interests have changed.  

**Solution:**  
- Regularly retrain the model with fresh data.  
- Use domain adaptation techniques to adjust the model for new conditions.
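
One simple way to decide when retraining is due is to monitor how far a feature's production mean has moved from its training mean. The sketch below is a rough heuristic under stated assumptions: the feature values, the function name `mean_shift`, and the threshold of 3 training standard deviations are all hypothetical.

```python
import statistics


def mean_shift(train_values, production_values):
    """Shift of the production mean, in units of the training standard deviation."""
    train_mean = statistics.mean(train_values)
    train_std = statistics.stdev(train_values)
    return abs(statistics.mean(production_values) - train_mean) / train_std


# Hypothetical values of one numeric feature at training time and in production.
train_feature = [10, 12, 11, 13, 9, 10, 12, 11]
production_recent = [18, 20, 19, 21]

ALERT_THRESHOLD = 3.0  # assumed cut-off; tune per feature
shift = mean_shift(train_feature, production_recent)
needs_retraining = shift > ALERT_THRESHOLD
print(f"shift = {shift:.2f} training standard deviations, retrain: {needs_retraining}")
```

A check like this only catches shifts in the mean of a single feature; changes in spread or in the relationship between features and labels need other tests.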

By addressing these challenges thoughtfully and leveraging appropriate techniques, you can build more robust and effective Machine Learning systems.

---

# Example

In [1]:
!wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt



In [2]:
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'mobilenet_v2', pretrained=True)
model.eval()

# Download an example image (a cat photo hosted on Wikimedia Commons)
import urllib.request
url, filename = ("https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/640px-Cat03.jpg", "cat.jpg")
urllib.request.urlretrieve(url, filename)

# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model


with torch.no_grad():
    output = model(input_batch)

# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)


# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]

# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())

# Display the input image
import matplotlib.pyplot as plt
plt.imshow(input_image)
plt.show()
