# Genetic Algorithm (GA) / Algoritmo Genético (AG)

## **English Summary**  
**Genetic Algorithms** are optimization techniques inspired by **natural selection** and **evolutionary biology**. They mimic the process of natural evolution to solve complex problems by iteratively improving candidate solutions.  

### Key Concepts:  
- **Population**: A set of candidate solutions (individuals) encoded as chromosomes (e.g., binary strings).  
- **Fitness Function**: Evaluates how well a solution solves the problem.  
- **Selection**: Prioritizes individuals with higher fitness to "reproduce" (e.g., roulette wheel, tournament selection).  
- **Crossover**: Combines genetic information from two parents to create offspring.  
- **Mutation**: Introduces small random changes to maintain genetic diversity and reduce the risk of getting stuck in local optima.  
- **Replacement**: Generates a new population by replacing weak individuals with offspring.  
- **Termination**: Stops when a stopping criterion is met (e.g., a maximum number of generations or a target fitness is reached).  
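
The two selection schemes named above can be sketched in a few lines. This is a minimal illustration, assuming non-negative fitness values; the function names `roulette_wheel_select` and `tournament_select` and the tournament size `k` are hypothetical choices for this example, not part of any library.

```python
import random

def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total == 0:
        # No individual has any fitness: fall back to a uniform choice.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running > pick:
            return individual
    # Guard against floating-point rounding at the upper end of the wheel.
    return population[-1]

def tournament_select(population, fitnesses, rng, k=3):
    """Sample k distinct individuals at random and return the fittest one."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]
```

Roulette wheel selection is sensitive to the scale of the fitness values, while tournament selection only depends on their ranking, which is one reason the latter is often easier to tune.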

### Applications:  
- Optimization (parameters, scheduling, routing).  
- Machine learning (hyperparameter tuning, feature selection).  
- Engineering design and robotics.  

```python
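# Simplified GA example (pseudocode: the helper functions are placeholders)
population = initialize_population()
while not termination_condition(population):
    fitness = evaluate(population)                 # score every individual
    parents = select_parents(population, fitness)  # favor fitter individuals
    offspring = crossover(parents)                 # recombine parent chromosomes
    offspring = mutate(offspring)                  # apply small random changes
    population = replace(population, offspring)    # form the next generation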
```
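
For a runnable counterpart to the loop above, here is a minimal sketch on the OneMax toy problem (maximize the number of 1-bits in a bit string). The problem choice, the function name `run_ga`, and all parameter defaults are assumptions made for illustration; it uses tournament selection, one-point crossover, bit-flip mutation, and keeps the best individual each generation (elitism).

```python
import random

def run_ga(n_bits=20, pop_size=30, generations=60,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Evolve bit strings toward all ones; return (best_individual, best_fitness)."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the count of 1-bits

    def pick_parent(population):
        # Tournament selection: best of 3 randomly sampled individuals.
        return max(rng.sample(population, 3), key=fitness)

    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        if fitness(best) == n_bits:  # termination: target fitness reached
            break
        offspring = [best[:]]  # elitism: carry the current best over unchanged
        while len(offspring) < pop_size:
            p1, p2 = pick_parent(population), pick_parent(population)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: flip each bit independently with small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        population = offspring  # replacement: the new generation takes over
        best = max(population, key=fitness)

    return best, fitness(best)

if __name__ == "__main__":
    best, score = run_ga()
    print(f"best fitness: {score} / {len(best)}")
```

Because the best individual is always copied into the next generation, the best fitness never decreases from one generation to the next.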

## **Resumo em Português do Brasil**  
**Algoritmos Genéticos** são técnicas de otimização inspiradas na **seleção natural** e na **biologia evolutiva**. Eles simulam o processo de evolução natural para resolver problemas complexos, aprimorando iterativamente soluções candidatas.  

### Conceitos-Chave:  
- **População**: Conjunto de soluções candidatas (indivíduos) codificadas como cromossomos (ex.: strings binárias).  
- **Função de Aptidão**: Avalia a qualidade de uma solução para o problema.  
- **Seleção**: Prioriza indivíduos com maior aptidão para "reproduzir" (ex.: roleta, seleção por torneio).  
- **Cruzamento**: Combina informações genéticas de dois pais para gerar descendentes.  
- **Mutação**: Introduz alterações aleatórias para manter diversidade genética e evitar ótimos locais.  
- **Substituição**: Gera nova população substituindo indivíduos menos aptos por descendentes.  
- **Término**: Interrompe quando uma solução atinge critérios (ex.: gerações máximas, aptidão desejada).  

### Aplicações:  
- Otimização (parâmetros, escalonamento, roteamento).  
- Aprendizado de máquina (ajuste de hiperparâmetros, seleção de características).  
- Projetos de engenharia e robótica.  

```python
# Exemplo Simplificado de AG (Pseudocódigo)  
população = inicializar_população()  
enquanto não condição_de_parada:  
    aptidão = avaliar(população)  
    pais = selecionar_pais(população, aptidão)  
    descendentes = cruzamento(pais)  
    descendentes = mutação(descendentes)  
    população = substituir(população, descendentes)  
```

---  
*Explore mais implementações em repositórios de algoritmos bio-inspirados!* 🧬🔍  

In [1]:
def fitness_function():
    pass

In [2]:
pip install fireducks

Collecting fireducks
  Downloading fireducks-1.2.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (1.0 kB)
Collecting firefw==1.2.0 (from fireducks)
  Downloading firefw-1.2.0-py3-none-any.whl.metadata (818 bytes)
Collecting pandas<2.3.0,>=1.5.3 (from fireducks)
  Downloading pandas-2.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (89 kB)
Collecting pyarrow<19.1,>=19.0 (from fireducks)
  Downloading pyarrow-19.0.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (3.3 kB)
Collecting pytz>=2020.1 (from pandas<2.3.0,>=1.5.3->fireducks)
  Downloading pytz-2025.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas<2.3.0,>=1.5.3->fireducks)
  Using cached tzdata-2025.1-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading fireducks-1.2.0-cp310-cp310-manylinux_2_28_x86_64.whl (7.2 MB)
Downloading firefw-1.2.0-

In [3]:
import tensorflow as tf

2025-01-31 08:34:32.140500: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1738323274.111774    9505 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738323274.569320    9505 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-31 08:34:49.535679: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [4]:
# Verificar se a GPU está disponível
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print("GPU está disponível.")
    for gpu in gpus:
        print(f"Nome da GPU: {gpu.name}")
else:
    print("GPU não está disponível.")

GPU está disponível.
Nome da GPU: /physical_device:GPU:0


In [5]:
tf.config.list_logical_devices('GPU')

I0000 00:00:1738323523.583288    9505 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2248 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5


[LogicalDevice(name='/device:GPU:0', device_type='GPU')]

In [7]:
import fireducks.pandas as fd
import pandas as pd
import numpy as np
import time

In [14]:
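def generate_data(rows=100000):
    """Build a synthetic DataFrame with `rows` rows (fixed seed for repeatability)."""
    np.random.seed(42)
    base_dates = pd.date_range(start='2020-01-01', periods=3650, freq='D')
    data = {
        'id': np.arange(rows),
        'category': np.random.choice(['A', 'B', 'C', 'D'], size=rows),
        'value': np.random.randint(1, 1000, size=rows),
        'date': np.random.choice(base_dates, size=rows)
    }
    return pd.DataFrame(data)

def benchmark_pandas(df: pd.DataFrame):
    """Time a filter, a group-by aggregation, and a join on `df`."""
    results = {}

    # perf_counter() is a monotonic, high-resolution clock intended for timing.
    start = time.perf_counter()
    filtered = df[df['value'] > 500]
    results['filtering_time'] = time.perf_counter() - start

    start = time.perf_counter()
    grouped = df.groupby('category')['value'].sum()
    results['grouping_time'] = time.perf_counter() - start

    other_df = df[['category', 'date']].drop_duplicates()
    start = time.perf_counter()
    # Join on both keys. Joining on 'category' alone would match each row with
    # every distinct date in its category (thousands of matches per row),
    # which multiplies the result size and can exhaust memory.
    joined = df.merge(other_df, on=['category', 'date'])
    results['joining_time'] = time.perf_counter() - start

    return results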

if __name__ == "__main__":
    sizes = [100000, 200000, 500000]
    for size in sizes:
        print(f"\nBenchmarking com {size} linhas:")
        df = generate_data(size)
        results = benchmark_pandas(df)
        
        print(f"Tempo de filtering: {results['filtering_time']:.4f}s")
        print(f"Tempo de grouping: {results['grouping_time']:.4f}s")
        print(f"Tempo de joining: {results['joining_time']:.4f}s")


Benchmarking com 100000 linhas:



In [None]:
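def generate_data(rows=100000):
    """Build the same synthetic data as above, as a FireDucks DataFrame."""
    np.random.seed(42)
    base_dates = fd.date_range(start='2020-01-01', periods=3650, freq='D')
    data = {
        'id': np.arange(rows),
        'category': np.random.choice(['A', 'B', 'C', 'D'], size=rows),
        'value': np.random.randint(1, 1000, size=rows),
        'date': np.random.choice(base_dates, size=rows)
    }
    return fd.DataFrame(data)

def force(obj):
    """Force evaluation of a lazily computed FireDucks result.

    Assumption: FireDucks exposes `_evaluate()` for this purpose, as its
    benchmarking notes describe. If the attribute is missing, the object is
    returned unchanged.
    """
    evaluate = getattr(obj, '_evaluate', None)
    return evaluate() if callable(evaluate) else obj

def benchmark_fireducks(df: fd.DataFrame):
    """Time a filter, a group-by aggregation, and a join on `df`."""
    results = {}

    # FireDucks evaluates lazily, so each result is forced inside the timed
    # region; otherwise the timer would only measure building the query.
    start = time.perf_counter()
    filtered = force(df[df['value'] > 500])
    results['filtering_time'] = time.perf_counter() - start

    start = time.perf_counter()
    grouped = force(df.groupby('category')['value'].sum())
    results['grouping_time'] = time.perf_counter() - start

    # Materialize the lookup table first so its cost is not counted in the join.
    other_df = force(df[['category', 'date']].drop_duplicates())
    start = time.perf_counter()
    # Join on both keys. Joining on 'category' alone would match each row with
    # every distinct date in its category, which can exhaust memory.
    joined = force(df.merge(other_df, on=['category', 'date']))
    results['joining_time'] = time.perf_counter() - start

    return results

if __name__ == "__main__":
    sizes = [100000, 200000, 500000]
    for size in sizes:
        print(f"\nBenchmarking com {size} linhas:")
        df = force(generate_data(size))
        results = benchmark_fireducks(df)

        print(f"Tempo de filtering: {results['filtering_time']:.4f}s")
        print(f"Tempo de grouping: {results['grouping_time']:.4f}s")
        print(f"Tempo de joining: {results['joining_time']:.4f}s")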