# **News: Gemini (Google)**

Website: https://deepmind.google/technologies/gemini/

Blogpost: https://blog.google/technology/ai/google-gemini-ai

Multimodality video: https://twitter.com/GoogleDeepMind/status/1732461149554094259

Paper: https://paperswithcode.com/paper/gemini-a-family-of-highly-capable-multimodal

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('jV1vkHv4zq8')

# **Keras 3.0**

Most important features:
- multi-backend (TensorFlow, JAX, PyTorch)
- Keras Ops module
- optimized & redesigned
- KerasCV & KerasNLP packages

## **Installation**

From command line:
```bash
pip3 install keras --upgrade
```

For Anaconda environment:
```bash
conda activate your_environment
conda install pytorch
pip3 install keras --upgrade
```
   
When creating a new environment:
```bash
conda create -n Keras3 python=3.11 pytorch  # or tensorflow or jax
conda activate Keras3
pip3 install keras --upgrade
```

With a Conda installation, GPU support should be set up automatically. If you encounter issues, check the current requirements at https://github.com/keras-team/keras/tree/master.

## **Import & select backend**

From Python script or notebook:
```python
import os
os.environ["KERAS_BACKEND"] = "jax"
```

From command line:
```bash
export KERAS_BACKEND="jax"
```

Permanently:

Edit the `keras.json` file (usually located in the hidden `.keras` folder in your home directory):
```json
{
    ...
    "backend": "jax",
    ...
}
```
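
The permanent option can also be scripted. The following is a minimal sketch, assuming the config is plain JSON stored at `~/.keras/keras.json`; the helper name `set_keras_backend` is hypothetical and not part of Keras. It changes only the `backend` key and leaves all other settings untouched.

```python
import json
from pathlib import Path


def set_keras_backend(backend, config_path=None):
    """Set the "backend" key in a Keras-style JSON config, keeping other keys."""
    # Assumed default location; adjust if your config lives elsewhere.
    path = Path(config_path) if config_path else Path.home() / ".keras" / "keras.json"
    # Start from the existing settings if the file is already there.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["backend"] = backend
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return config
```

Calling `set_keras_backend("jax")` would rewrite the file in your home directory. The setting is read when Keras is imported, so an already running kernel needs a restart.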

In [None]:
import matplotlib.pyplot as plt

import os
# The backend must be selected before Keras is imported.
os.environ["KERAS_BACKEND"] = "torch"

from keras import ops
from keras import layers
from keras.models import Model, Sequential
from keras.metrics import CategoricalAccuracy
from keras.utils import plot_model, to_categorical
from keras.optimizers import Adam
from keras.datasets import mnist

## **Load data**

In [None]:
(X_train, y_train), (X_val, y_val) = mnist.load_data()

# Normalize pixel values from [0, 255] to [0, 1]
X_train = X_train.astype("float32") / 255
X_val = X_val.astype("float32") / 255

In [None]:
plt.figure(figsize=(7,7))
x = 1
for i in range(5):
    for j in range(5):
        plt.subplot(5,5,x)
        plt.title(f"Label : {y_train[x]}")
        plt.imshow(X_train[x], cmap="gray");
        plt.axis("off")
        x += 1

### **Transform X**

In [None]:
X_train.shape

In [None]:
# Reshape
X_train = X_train.reshape((*X_train.shape, 1))  
X_val = X_val.reshape((*X_val.shape, 1))

In [None]:
X_train.shape

### **Transform Y**

In [None]:
y_train[0]

In [None]:
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)

In [None]:
y_train[0]

In [None]:
ops.argmax(y_train[0])
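
To make the transformation concrete, here is a small NumPy sketch of what one-hot encoding does and how `argmax` inverts it. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation, and the three labels are made-up examples.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype="int64")
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Put a 1 in the column given by each label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


labels = np.array([5, 0, 4])
encoded = one_hot(labels, num_classes=10)
print(encoded[0])               # 1.0 at index 5, zeros elsewhere
print(encoded.argmax(axis=1))   # recovers the original labels
```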

## **Define network**

In [None]:
def get_model():
    model = Sequential(
        [layers.Input(shape=(28, 28, 1)),
         layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
         layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
         layers.GlobalAveragePooling2D(),
         layers.Dropout(0.2),
         layers.Dense(10, activation="softmax")]
    )
    
    return model

model = get_model()

In [None]:
model.summary()

In [None]:
plot_model(
    model,
    dpi=70,
    show_shapes=True,
    show_dtype=True,
    show_layer_activations=True,
    rankdir="TB"
)

## **Train model**

### **Default `.fit` method**

In [None]:
model = get_model()

model.compile(optimizer=Adam(learning_rate=0.002),
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          batch_size=64, epochs=10)

### **Custom fit**

In [None]:
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

#### **optimizer \& loss function**

In [None]:
# Load Keras model
model = get_model()

# Optimizer: Adam
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

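# Loss function: cross-entropy
# (PyTorch built-in shown for reference; the custom definition below overrides it)
loss_fn = torch.nn.CrossEntropyLoss()

def loss_fn(y_pred, y_true):
    # Clipping keeps y_pred away from 0 and 1, so the logarithms stay finite
    y_pred = ops.clip(y_pred, 1e-7, 1 - 1e-7)
    
    # Element-wise (binary) cross-entropy, averaged over all classes and samples
    loss = -(y_true * ops.log(y_pred) + (1 - y_true) * ops.log(1 - y_pred))
    return ops.mean(loss)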
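
For a one-hot target, categorical cross-entropy reduces to the negative log of the probability assigned to the true class. The NumPy sketch below illustrates this and shows why predictions are clipped before taking the logarithm. The function name and the example values are illustrative assumptions, not part of the notebook's model.

```python
import numpy as np


def categorical_crossentropy(y_pred, y_true, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


y_true = np.array([[0.0, 1.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05]])
wrong = np.array([[1.0, 0.0, 0.0]])   # true class gets probability 0

print(categorical_crossentropy(confident, y_true))  # equals -log(0.9)
print(categorical_crossentropy(wrong, y_true))      # large but finite thanks to clipping
```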

#### **1. input array (batchsize $\,\times\,$ image shape)**

In [None]:
batchsize = 16

inputs = torch.tensor(X_train[0:batchsize], device=device)
targets = torch.tensor(y_train[0:batchsize], device=device)

print(inputs.shape, targets.shape)

In [None]:
print(inputs.device, targets.device)

#### **2. forward pass - array of probabilities (batchsize $\,\times\,$ classes)**

In [None]:
probs = model(inputs)

probs.shape

In [None]:
probs[0]

In [None]:
probs[0].sum()
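
The sum is 1 (up to floating-point error) because the last layer of the network applies a softmax. A minimal NumPy sketch of that normalization, using made-up logits rather than real model outputs:

```python
import numpy as np


def softmax(logits):
    """Turn each row of scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


example_logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
example_probs = softmax(example_logits)
print(example_probs)
print(example_probs.sum(axis=-1))  # each row sums to 1
```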

#### **3. calculate the loss**

In [None]:
loss = loss_fn(probs, targets)

loss

#### **4. backward pass**

In [None]:
model.zero_grad()

loss.backward()

#### **5. optimizer**

In [None]:
optimizer.step()

#### **check if model changed**

In [None]:
probs2 = model(inputs)

loss2 = loss_fn(probs2, targets)

loss2

### **Training loop**

In [None]:
# Load Keras model
model = get_model()

# Optimizer: Adam
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

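# Loss function: categorical cross-entropy.
# The model already ends in a softmax, while torch.nn.CrossEntropyLoss expects
# raw logits, so the loss is computed directly from the probabilities.
def loss_fn(probs, targets):
    return -(targets * torch.log(probs.clamp_min(1e-7))).sum(dim=1).mean()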


batchsize = 128
epochs = 10
number_of_updates = X_train.shape[0] // batchsize

for epoch in range(epochs):
    print(f"Epoch: {epoch+1}")
    
    for step in range(number_of_updates):
        i0 = step * batchsize
        i1 = (step+1) * batchsize
        
        # Pytorch Tensor + Send to GPU
        inputs = torch.tensor(X_train[i0:i1], device=device)
        targets = torch.tensor(y_train[i0:i1], device=device)
        
        # Forward pass
        probs = model(inputs)
        loss = loss_fn(probs, targets)

        # Backward pass
        model.zero_grad()
        loss.backward()

        # Optimizer variable updates
        optimizer.step()

        # Log every 100 batches.
        if (step + 1) % 100 == 0:
            print(f"Training loss at step {step:4d} ({(step + 1) * batchsize:5d} images): {loss.cpu().detach().numpy():.4f}")
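
Because `number_of_updates` uses integer division, the loop skips the last incomplete batch. The plain-Python sketch below assumes the 60,000 MNIST training images and the batch size used above; `batch_slices` is a hypothetical helper introduced only for illustration.

```python
def batch_slices(num_samples, batchsize):
    """Return (start, stop) index pairs for all full batches."""
    number_of_updates = num_samples // batchsize
    return [(step * batchsize, (step + 1) * batchsize)
            for step in range(number_of_updates)]


slices = batch_slices(num_samples=60000, batchsize=128)
print(len(slices))             # 468 full batches
print(slices[0], slices[-1])   # (0, 128) (59776, 59904)
print(60000 - slices[-1][1])   # 96 images are never visited
```

Shuffling the sample order at the start of every epoch is one common way to make sure the leftover images are eventually used.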

### **Measure accuracy**

In [None]:
# Load Keras model
model = get_model()

# Optimizer: Adam
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

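# Loss function: categorical cross-entropy.
# The model already ends in a softmax, while torch.nn.CrossEntropyLoss expects
# raw logits, so the loss is computed directly from the probabilities.
def loss_fn(probs, targets):
    return -(targets * torch.log(probs.clamp_min(1e-7))).sum(dim=1).mean()

# Metrics: Categorical Accuracy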
train_acc_metric = CategoricalAccuracy()
val_acc_metric = CategoricalAccuracy()


batchsize = 128
epochs = 10
number_of_updates = X_train.shape[0] // batchsize

for epoch in range(epochs):
    print(f"Epoch: {epoch+1}")
    
    for step in range(number_of_updates):
        i0 = step * batchsize
        i1 = (step+1) * batchsize
        
        # Pytorch Tensor + Send to GPU
        inputs = torch.tensor(X_train[i0:i1], device=device)
        targets = torch.tensor(y_train[i0:i1], device=device)
        
        # Forward pass
        probs = model(inputs)
        loss = loss_fn(probs, targets)

        # Update metrics
        train_acc_metric.update_state(targets, probs)
        
        # Backward pass
        model.zero_grad()
        loss.backward()

        # Optimizer variable updates
        optimizer.step()

        # Log every 100 batches.
        if (step + 1) % 100 == 0:
            print(f"Training loss at step {step:4d} ({(step + 1) * batchsize:5d} images): {loss.cpu().detach().numpy():.4f}")
            
    train_acc = train_acc_metric.result()
    print(f"Training acc: {float(train_acc):.4f}")
    
    # Reset training metrics at the end of each epoch
    train_acc_metric.reset_state()

    # Compute validation predictions (no gradients needed)
    with torch.no_grad():
        val_probs = model(X_val, training=False)
    # Update val metrics
    val_acc_metric.update_state(y_val, val_probs)
    val_acc = val_acc_metric.result()
    val_acc_metric.reset_state()
    print(f"Validation acc: {float(val_acc):.4f}\n")
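
The metric objects are stateful: `update_state` accumulates counts batch by batch, `result` reports the running value, and `reset_state` clears it for the next epoch. The NumPy class below is a simplified sketch of that pattern, not the Keras implementation; the name `RunningAccuracy` and the toy batches are assumptions.

```python
import numpy as np


class RunningAccuracy:
    """Accumulates correct/total counts across batches."""

    def __init__(self):
        self.reset_state()

    def update_state(self, y_true, y_pred):
        # A prediction is correct when the argmax of both arrays agrees.
        matches = np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)
        self.correct += int(matches.sum())
        self.total += len(matches)

    def result(self):
        return self.correct / self.total if self.total else 0.0

    def reset_state(self):
        self.correct = 0
        self.total = 0


metric = RunningAccuracy()
metric.update_state(np.array([[1, 0], [0, 1]]), np.array([[0.9, 0.1], [0.8, 0.2]]))
metric.update_state(np.array([[0, 1], [0, 1]]), np.array([[0.3, 0.7], [0.4, 0.6]]))
print(metric.result())  # 3 of 4 predictions correct -> 0.75
```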

### **Adding a scheduler**

In [None]:
# Load Keras model
model = get_model()

# Optimizer: Adam
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.8)

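# Loss function: categorical cross-entropy.
# The model already ends in a softmax, while torch.nn.CrossEntropyLoss expects
# raw logits, so the loss is computed directly from the probabilities.
def loss_fn(probs, targets):
    return -(targets * torch.log(probs.clamp_min(1e-7))).sum(dim=1).mean()

# Metrics: Categorical Accuracy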
train_acc_metric = CategoricalAccuracy()
val_acc_metric = CategoricalAccuracy()


batchsize = 128
epochs = 10
number_of_updates = X_train.shape[0] // batchsize

for epoch in range(epochs):
    print(f"Epoch: {epoch+1}")
    print(f"Learning rate: {optimizer.param_groups[0]['lr']}")
    
    for step in range(number_of_updates):
        i0 = step * batchsize
        i1 = (step+1) * batchsize
        
        # Pytorch Tensor + Send to GPU
        inputs = torch.tensor(X_train[i0:i1], device=device)
        targets = torch.tensor(y_train[i0:i1], device=device)
        
        # Forward pass
        probs = model(inputs)
        loss = loss_fn(probs, targets)

        # Update metrics
        train_acc_metric.update_state(targets, probs)
        
        # Backward pass
        model.zero_grad()
        loss.backward()

        # Optimizer variable updates
        optimizer.step()

        # Log every 100 batches.
        if (step + 1) % 100 == 0:
            print(f"Training loss at step {step:4d} ({(step + 1) * batchsize:5d} images): {loss.cpu().detach().numpy():.4f}")
            
    scheduler.step()
            
    train_acc = train_acc_metric.result()
    print(f"Training acc: {float(train_acc):.4f}")
    
    # Reset training metrics at the end of each epoch
    train_acc_metric.reset_state()

    # Compute validation predictions (no gradients needed)
    with torch.no_grad():
        val_probs = model(X_val, training=False)
    # Update val metrics
    val_acc_metric.update_state(y_val, val_probs)
    val_acc = val_acc_metric.result()
    val_acc_metric.reset_state()
    print(f"Validation acc: {float(val_acc):.4f}\n")
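
With `ExponentialLR`, every call to `scheduler.step()` multiplies the learning rate by `gamma`, so the rate used in epoch $n$ (counting from 0) is $\text{lr}_0 \cdot \gamma^{n}$. The plain-Python sketch below reproduces that schedule for the values used above; `exponential_lr` is a hypothetical helper, not a PyTorch function.

```python
def exponential_lr(initial_lr, gamma, epochs):
    """Learning rate used in each epoch when decaying once per epoch."""
    return [initial_lr * gamma ** epoch for epoch in range(epochs)]


schedule = exponential_lr(initial_lr=0.002, gamma=0.8, epochs=10)
for epoch, lr in enumerate(schedule, start=1):
    print(f"Epoch {epoch:2d}: lr = {lr:.6f}")
```

After ten epochs the rate has dropped to roughly 13% of its starting value, since $0.8^{9} \approx 0.134$.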

## **Final projects**

Datasets:
- [AstroNN](https://astronn.readthedocs.io/en/stable/galaxy10.html)
- [AstroML](https://www.astroml.org/user_guide/datasets.html)
- [Kaggle](https://www.kaggle.com/)
- [HuggingFace](https://huggingface.co/datasets)

Or build your own dataset:
- filter & preprocess data: 
    - light curves ([OGLE](https://ogledb.astrouw.edu.pl/~ogle/OCVS/))
    - X-ray data ([Chandra](https://cxcfps.cfa.harvard.edu/cda/footprint/cdaview.html))
    - GAIA data ([Astroquery](https://www.cosmos.esa.int/web/gaia-users/archive/programmatic-access))
    - VLBI images ([astrogeo](http://astrogeo.org/vlbi_images/)) 
    - SwiftXRT data ([Swift database](https://www.swift.ac.uk/user_objects/))
- simulate data: 
    - gamma-ray bursts ([cosmogrb](https://github.com/grburgess/cosmogrb))
    - X-ray spectrum ([Sherpa](https://cxc.cfa.harvard.edu/sherpa/threads/fake_pha/))
    - X-ray image ([pyxsim](https://hea-www.cfa.harvard.edu/~jzuhone/pyxsim/cookbook/Thermal_Emission.html))