# Redis Data Integration (RDI)
RDI is a Redis Enterprise feature that helps users ingest data in near real time.
## RDI currently supports two scenarios:
- ### Data Ingest
![diagram of ingest feature](img/ingest.png)
- ### Write-behind 
(Note: write-behind is currently in public preview.)
![diagram of write-behind](img/write-behind.png) 

### Supported Sources for Ingest
- Oracle
- MariaDB
- MongoDB
- MySQL
- Percona XtraDB
- Postgres
- SQL Server
- Cassandra
- DataStax Cassandra


### Supported Targets (Write-behind)
- Oracle
- MariaDB
- MySQL
- Postgres
- SQL Server
- Cassandra

## RDI Demo Environment
![demo environment schematic](img/topology.png)

## Data Ingest (Change Data Capture) Demo
**Here's the script to continually generate semi-random data and `INSERT` it into Postgres.**


```python
import random
import time

import pandas as pd
from sqlalchemy import create_engine

DB_HOST = "postgresql"
DB_PORT = 5432
DB_NAME = "chinook"
DB_USER = "postgres"
DB_PASSWORD = "postgres"

# %s placeholders are filled in by the psycopg2 driver
INSERT_STMT = """INSERT INTO public."Track"
        ("TrackId", "Name", "AlbumId", "MediaTypeId", "GenreId", "Composer", "Milliseconds", "Bytes", "UnitPrice")
        VALUES (%s, %s, 1, 1, %s, %s, %s, %s, 0.99)"""


def main():
    """A script to generate and insert random track data into the 'Track' table of the Chinook database.

    This script connects to the Chinook database and continuously inserts new track records
    with random data into the 'Track' table. It uses a CSV file containing track names and composers
    and generates random values for other track attributes such as genre, milliseconds, and bytes.
    The album, media type, and unit price are fixed.
    """
    # read chinook track data from CSV
    track_df = pd.read_csv("track.csv", usecols=["Name", "Composer"])

    # connect to database
    print("connecting to DB")
    engine = create_engine(f"postgresql+psycopg2://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}")

    # fetch largest track id; exec_driver_sql hands the raw SQL string to the driver
    # and is available in both SQLAlchemy 1.4 and 2.x
    with engine.connect() as conn:
        track_id = conn.exec_driver_sql('SELECT COALESCE(MAX("TrackId"), 0) FROM "Track"').scalar()

    while True:
        # pick a random source row (assumes track.csv has at least 3000 rows)
        track_rand_id = random.randrange(2, 3000)
        track_name = track_df.iloc[track_rand_id, 0]
        track_genre = random.randrange(1, 5)
        track_composer = track_df.iloc[track_rand_id, 1]
        if pd.isna(track_composer):
            track_composer = None  # store a missing composer as NULL rather than NaN
        track_milliseconds = random.randrange(100000, 300000)
        track_bytes = random.randrange(100000, 500000)
        track_id += 1

        # engine.begin() commits when the block exits, so each row is committed on its own
        with engine.begin() as conn:
            conn.exec_driver_sql(
                INSERT_STMT,
                (track_id, track_name, track_genre, track_composer, track_milliseconds, track_bytes),
            )

        print(".", end="", flush=True)
        # pause 100-500 ms between inserts
        time.sleep(random.randint(100, 500) / 1000)


if __name__ == "__main__":
    main()
```
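
To try the row-generation logic without a running Postgres instance, it can be pulled into a pure function. The sketch below is not part of the demo: `make_track_row` and the sample name/composer pairs are hypothetical, and only the value ranges are taken from the script above.

```python
import random


def make_track_row(track_id, names_and_composers, rng=random):
    """Return one parameter tuple in the order the INSERT statement expects.

    Hypothetical helper: mirrors the value ranges used by the script,
    but takes its (name, composer) pairs from a plain list instead of a CSV.
    """
    name, composer = rng.choice(names_and_composers)
    genre_id = rng.randrange(1, 5)                # 1..4, upper bound excluded
    milliseconds = rng.randrange(100000, 300000)
    size_bytes = rng.randrange(100000, 500000)
    return (track_id, name, genre_id, composer, milliseconds, size_bytes)


# Made-up sample data, standing in for rows read from track.csv
sample = [("Track A", "Composer A"), ("Track B", None)]
seeded = random.Random(42)  # a seeded generator makes runs repeatable
for track_id in range(1, 4):
    print(make_track_row(track_id, sample, seeded))
```

Passing a seeded `random.Random` instance keeps the output repeatable, which makes it easier to compare what was generated with what later appears in Redis.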

**We'll use the RDI CLI to deploy.**

![command line screenshot of the deploy](img/redis_di_deploy.png)

`redis-di deploy --rdi-host re-n1 --rdi-port 12001 --rdi-password redislabs --dir redis_di_config`
1. `redis-di`: the command-line tool to manage and configure Redis Data Integration.
2. `deploy`: the command that deploys the RDI configuration.
3. The connection:\
  `--rdi-host re-n1`\
  `--rdi-port 12001`\
  `--rdi-password redislabs`
4. `--dir redis_di_config` : Directory containing RDI configuration
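
If the deploy step needs to be scripted, the same command can be assembled from Python. This is a minimal sketch that uses only the flags listed above; `build_deploy_command` is a hypothetical helper name, not part of the RDI tooling.

```python
import shlex


def build_deploy_command(host, port, password, config_dir=None):
    """Assemble the `redis-di deploy` argument list from the flags shown above."""
    argv = [
        "redis-di", "deploy",
        "--rdi-host", host,
        "--rdi-port", str(port),
        "--rdi-password", password,
    ]
    if config_dir is not None:
        # --dir is optional here because the CLI can also be run from inside the config directory
        argv += ["--dir", config_dir]
    return argv


argv = build_deploy_command("re-n1", 12001, "redislabs", "redis_di_config")
print(shlex.join(argv))
# To actually run it (requires redis-di on the PATH):
#   import subprocess; subprocess.run(argv, check=True)
```

Keeping the arguments as a list rather than one string avoids shell-quoting problems, for example when the password is an empty string.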

## Write-behind (Caching) Demo
**We'll use the RDI CLI to deploy this too, but with different configurations.**
![RDI configuration step](img/redis-di-config.jpg)
1. Go to the directory that holds the write-behind configuration.\
`cd redis_di_wb_config`\
Running from this directory is *in lieu of* specifying `--dir` as we did above.
2. Run the configuration.\
`redis-di configure --rdi-host re-n1 --rdi-port 12000`
3. Use the CLI to deploy like before.\
![RDI wb deployment step](img/redis-di-deploy.jpg)
`redis-di deploy --rdi-host re-n1 --rdi-port 12000 --rdi-password ""`
- `redis-di`: the command-line tool to manage and configure Redis Data Integration.
- `deploy`: the command that deploys the RDI configuration.
- The connection:\
  `--rdi-host re-n1`\
  `--rdi-port 12000`\
  `--rdi-password ""` (empty string)
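
The write-behind steps above can also be sketched as a small script. It is an illustration only: `write_behind_steps` and `run_steps` are hypothetical helpers, the commands and flags are copied from the steps above, and `dry_run=True` prints the commands instead of executing them.

```python
import shlex
import subprocess

WB_CONFIG_DIR = "redis_di_wb_config"  # directory from step 1


def write_behind_steps(host="re-n1", port=12000, password=""):
    """Return the configure and deploy commands, in the order they are run."""
    connection = ["--rdi-host", host, "--rdi-port", str(port)]
    return [
        ["redis-di", "configure", *connection],
        ["redis-di", "deploy", *connection, "--rdi-password", password],
    ]


def run_steps(steps, cwd, dry_run=True):
    """Run each command from inside `cwd`, which takes the place of --dir."""
    for argv in steps:
        if dry_run:
            print(f"[{cwd}] {shlex.join(argv)}")
        else:
            subprocess.run(argv, cwd=cwd, check=True)


run_steps(write_behind_steps(), WB_CONFIG_DIR)
```

Setting `cwd` on `subprocess.run` plays the same role as the `cd` in step 1, so neither command needs `--dir`.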


Copyright 2023, Redis Inc., All rights reserved.