# **Install tpot and h2o with pip**

In [3]:
pip install tpot h2o

Note: you may need to restart the kernel to use updated packages.


# **Import Packages**

## pandas- **Data Manipulation and Analysis Library**


## TPOTClassifier- **Automated Machine Learning Tool for Classification**

## train_test_split- **Function to Split Datasets for Training and Testing**

## torch.nn- **Module for Building Neural Networks in PyTorch**

In [4]:
import pandas as pd
from tpot import TPOTClassifier
from sklearn.model_selection import train_test_split
import torch.nn as nn


## **Read the dataset**

In [5]:
data = pd.read_csv("california_housing_train.csv")
data



| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -114.31 | 34.19 | 15.0 | 5612.0 | 1283.0 | 1015.0 | 472.0 | 1.4936 | 66900.0 |
| 1 | -114.47 | 34.40 | 19.0 | 7650.0 | 1901.0 | 1129.0 | 463.0 | 1.8200 | 80100.0 |
| 2 | -114.56 | 33.69 | 17.0 | 720.0 | 174.0 | 333.0 | 117.0 | 1.6509 | 85700.0 |
| 3 | -114.57 | 33.64 | 14.0 | 1501.0 | 337.0 | 515.0 | 226.0 | 3.1917 | 73400.0 |
| 4 | -114.57 | 33.57 | 20.0 | 1454.0 | 326.0 | 624.0 | 262.0 | 1.9250 | 65500.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 16995 | -124.26 | 40.58 | 52.0 | 2217.0 | 394.0 | 907.0 | 369.0 | 2.3571 | 111400.0 |
| 16996 | -124.27 | 40.69 | 36.0 | 2349.0 | 528.0 | 1194.0 | 465.0 | 2.5179 | 79000.0 |
| 16997 | -124.30 | 41.84 | 17.0 | 2677.0 | 531.0 | 1244.0 | 456.0 | 3.0313 | 103600.0 |
| 16998 | -124.30 | 41.80 | 19.0 | 2672.0 | 552.0 | 1298.0 | 478.0 | 1.9797 | 85800.0 |


# **Analysing the data**

## **Reading the head of the data**

In [6]:
data.head()

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -114.31 | 34.19 | 15.0 | 5612.0 | 1283.0 | 1015.0 | 472.0 | 1.4936 | 66900.0 |
| 1 | -114.47 | 34.4 | 19.0 | 7650.0 | 1901.0 | 1129.0 | 463.0 | 1.82 | 80100.0 |
| 2 | -114.56 | 33.69 | 17.0 | 720.0 | 174.0 | 333.0 | 117.0 | 1.6509 | 85700.0 |
| 3 | -114.57 | 33.64 | 14.0 | 1501.0 | 337.0 | 515.0 | 226.0 | 3.1917 | 73400.0 |
| 4 | -114.57 | 33.57 | 20.0 | 1454.0 | 326.0 | 624.0 | 262.0 | 1.925 | 65500.0 |


## **Checking the shape of the data**

In [8]:
data.shape

(17000, 9)

## **Describing the data**

In [9]:
data.describe()

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 | 17000.0 |
| mean | -119.562108 | 35.625225 | 28.589353 | 2643.664412 | 539.410824 | 1429.573941 | 501.221941 | 3.883578 | 207300.912353 |
| std | 2.005166 | 2.13734 | 12.586937 | 2179.947071 | 421.499452 | 1147.852959 | 384.520841 | 1.908157 | 115983.764387 |
| min | -124.35 | 32.54 | 1.0 | 2.0 | 1.0 | 3.0 | 1.0 | 0.4999 | 14999.0 |
| 25% | -121.79 | 33.93 | 18.0 | 1462.0 | 297.0 | 790.0 | 282.0 | 2.566375 | 119400.0 |
| 50% | -118.49 | 34.25 | 29.0 | 2127.0 | 434.0 | 1167.0 | 409.0 | 3.5446 | 180400.0 |
| 75% | -118.0 | 37.72 | 37.0 | 3151.25 | 648.25 | 1721.0 | 605.25 | 4.767 | 265000.0 |
| max | -114.31 | 41.95 | 52.0 | 37937.0 | 6445.0 | 35682.0 | 6082.0 | 15.0001 | 500001.0 |


**This code splits the dataset into features (X) and the target variable (y), then divides them into training and testing sets, with 20% of the data reserved for testing.**

In [13]:
X = data.drop('median_house_value', axis=1)
y = data['median_house_value']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
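
As a minimal sketch of what the split does, the following uses a small synthetic DataFrame instead of the housing data. The column names `feature` and `target`, the variable names, and the `random_state=42` seed are assumptions for illustration and are not part of the original notebook. Fixing `random_state` makes the shuffle, and therefore the split, reproducible between runs.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the housing dataset.
toy = pd.DataFrame({"feature": range(100), "target": range(100, 200)})

X_toy = toy.drop("target", axis=1)  # features
y_toy = toy["target"]               # target variable

# test_size=0.2 reserves 20% of the rows for testing;
# random_state fixes the shuffle so the split is reproducible.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42
)

print(X_tr.shape, X_te.shape)
```

With 100 rows, 80 land in the training set and 20 in the test set, and no row appears in both.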

**This code defines a simple feedforward neural network called PrototypicalNetwork with two linear layers (64 → 128 → 5 by default) and no activation between them. It maps input features to output scores for classification tasks.**

In [14]:
class PrototypicalNetwork(nn.Module):
    def __init__(self, in_features=64, out_features=5): 
        super(PrototypicalNetwork, self).__init__()
        self.layer1 = nn.Linear(in_features, 128)
        self.layer2 = nn.Linear(128, out_features)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        return x
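
A quick way to check the network's shapes is to pass a random batch through it. The sketch below redefines the class so that it runs on its own; the batch size of 4, the random input, and the names `net`, `batch`, and `scores` are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PrototypicalNetwork(nn.Module):
    def __init__(self, in_features=64, out_features=5):
        super().__init__()
        self.layer1 = nn.Linear(in_features, 128)
        self.layer2 = nn.Linear(128, out_features)

    def forward(self, x):
        return self.layer2(self.layer1(x))

net = PrototypicalNetwork()
batch = torch.randn(4, 64)  # 4 hypothetical samples with 64 features each

# No gradients are needed for a shape check.
with torch.no_grad():
    scores = net(batch)

print(scores.shape)
```

Each of the 4 samples produces one score per output class, so the result has shape `(4, 5)`.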

## **Install huggingface_hub using pip** 

In [15]:
pip install --upgrade huggingface_hub

Collecting huggingface_hub
  Downloading huggingface_hub-0.26.1-py3-none-any.whl.metadata (13 kB)
Downloading huggingface_hub-0.26.1-py3-none-any.whl (447 kB)

## transformers- **Hugging Face Library for NLP and Vision Models**

## PIL- **Python Imaging Library for Image Processing**

## requests- **Library for Making HTTP Requests**

## io- **Module for Input and Output Operations in Python**

In [16]:
from transformers import ViltProcessor, ViltForQuestionAnswering, ViltConfig
from PIL import Image
import requests
from io import BytesIO

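**Read the Hugging Face access token (this requires a Hugging Face account). The token is read from the HF_TOKEN environment variable so that it is not hardcoded in the notebook.**

In [16]:
import os
HUGGINGFACE_TOKEN = os.environ.get("HF_TOKEN")  # None if the variable is not set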

The next cell performs visual question answering with ViLT in the following steps:

1. **Load the ViLT Processor**:
 - This sets up a tool that prepares images and questions for the ViLT model.

2. **Load the ViLT Model**:
 - This loads a pre-trained model that answers questions based on images.

3. **Set the Image URL**:
 - This line has the web address of the image to be analyzed.

4. **Download the Image**:
 - The program fetches the image from the internet using the URL.

5. **Open the Image**:
 - Opens the downloaded image so it can be processed.

6. **Define the Question**:
 - This is the question you're asking about the image, like "How many cats are there?"

7. **Prepare Inputs**:
 - Combines the image and question in a format the model can understand.

8. **Get Model's Prediction**:
 - The model processes the inputs to find the answer.

9. **Extract Scores**:
 - Gets the model's confidence scores for different answers.

10. **Find the Best Answer**:
 - Identifies which answer the model thinks is the best based on the scores.

11. **Map to Answer Text**:
 - Turns the best answer's index into actual words.

12. **Print the Predicted Answer**:
 - Displays the answer the model predicted based on the image.

In short, this code fetches an image, asks a question about it, and prints the answer predicted by the ViLT model.
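
Steps 9 to 11 can be tried in isolation without downloading the model. In this sketch the `logits` tensor and the `id2label` dictionary are made-up stand-ins for `outputs.logits` and `model.config.id2label`; only the `argmax` and lookup logic mirrors the steps above.

```python
import torch

# Hypothetical scores for one question over three candidate answers.
logits = torch.tensor([[0.1, 2.5, -1.0]])

# Hypothetical mapping from answer index to answer text.
id2label = {0: "1", 1: "2", 2: "3"}

predicted_index = logits.argmax(-1).item()  # index of the highest score
answer = id2label[predicted_index]          # map the index to answer text

print("Predicted Answer:", answer)
```

The highest score (2.5) sits at index 1, so the lookup returns the label stored under key 1.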


In [14]:

# Load the processor and the pre-trained VQA model.
processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa", token=HUGGINGFACE_TOKEN, revision="main")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", token=HUGGINGFACE_TOKEN, revision="main")

# Download and open the image.
image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
response = requests.get(image_url)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(BytesIO(response.content))

question = "How many cats are there?"

# Prepare the inputs and run the model.
inputs = processor(images=image, text=question, return_tensors="pt")
outputs = model(**inputs)

# Pick the highest-scoring answer and map its index to text.
logits = outputs.logits
predicted_index = logits.argmax(-1).item()
answer = model.config.id2label[predicted_index]

print("Predicted Answer:", answer)




Predicted Answer: 2



### stable-baselines3- **Library for Reinforcement Learning in Python**

### gymnasium- **Toolkit for Developing and Comparing Reinforcement Learning Environments**


In [17]:
!pip install stable-baselines3
!pip install gymnasium
import gymnasium as gym
from stable_baselines3 import PPO



## **Create Environment:** `env = gym.make("CartPole-v1")`

## **Train the Model:** `model.learn(total_timesteps=10000)`

In [18]:
env = gym.make("CartPole-v1")  
model = PPO("MlpPolicy", env)
model.learn(total_timesteps=10000)

<stable_baselines3.ppo.ppo.PPO at 0x237971278f0>

## **Import ray**

In [20]:
import ray

## **Initialize Ray:** `ray.init()`

## **Parallelize Training with Ray:** `futures = [train_model.remote(data_chunk) for data_chunk in data.to_numpy()]`

In [21]:
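ray.init()

@ray.remote  # required so that train_model.remote(...) exists
def train_model(data_chunk):
    return len(data_chunk)  # placeholder: the real per-chunk training logic is not shown

futures = [train_model.remote(data_chunk) for data_chunk in data.to_numpy()]  # one task per row
results = ray.get(futures)  # block until all tasks finish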

2024-10-21 00:07:17,070	INFO worker.py:1786 -- Started a local Ray instance.
