# MySQL AI
MySQL AI lets developers build rich applications on MySQL using built-in machine learning, GenAI, LLMs, and semantic search. Developers can create vectors from documents stored in a local file system. Customers can deploy these AI applications on premises or migrate them to MySQL HeatWave for lower cost, higher performance, richer functionality, and the latest LLMs, with no change to their application. This gives developers the flexibility to build their applications on MySQL EE and then deploy them either on premises or in the cloud.

This notebook demonstrates the application of [ML_GENERATE](https://dev.mysql.com/doc/heatwave/en/mys-hwgenai-ml-generate.html) for content generation using data from the 2024 Olympic Games.

### References
- https://blogs.oracle.com/mysql/post/announcing-mysql-ai
- https://dev.mysql.com/doc/mysql-ai/9.4/en/
- https://dev.mysql.com/doc/dev/mysql-studio/latest/#overview
- https://www.economicsobservatory.com/what-happened-at-the-2024-olympics
- https://en.wikipedia.org/wiki/2024_Summer_Olympics

### Prerequisites

- mysql-connector-python
- pandas 

##### Import Python packages

In [3]:
# import Python packages
import os
import json
import numpy as np
import pandas as pd
import mysql.connector

### Connect to MySQL AI instance
We create a connection to an active MySQL AI instance using [MySQL Connector/Python](https://dev.mysql.com/doc/connector-python/en/). We also define a helper function that executes a SQL query through a cursor and returns the result as a Pandas DataFrame. Modify the variables below to point to your MySQL AI instance.

 - In MySQL Studio, connections are restricted to `localhost` as the host.
 - In MySQL Studio, the only accepted password values are the string `unused` or `None`.

In [None]:
HOST = 'localhost'
PORT = 3306
USER = 'root'
PASSWORD = 'unused'
DATABASE = 'mlcorpus'


myconn = mysql.connector.connect(
    host=HOST,
    port=PORT,
    user=USER,
    password=PASSWORD,
    database=DATABASE,
    allow_local_infile=True,
    use_pure=True,
    autocommit=True,
)
mycursor = myconn.cursor()


# Helper function to execute SQL queries and return the results as a Pandas DataFrame
def execute_sql(sql: str) -> pd.DataFrame:
    mycursor.execute(sql)
    return pd.DataFrame(mycursor.fetchall(), columns=mycursor.column_names)
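
The connection settings above are hard-coded in the notebook. As a minimal sketch, they could instead be read from environment variables, falling back to the same defaults. The function name `connection_settings` and the `MYSQL_AI_*` variable names are hypothetical and chosen only for illustration; they are not defined by MySQL AI or MySQL Studio.

```python
import os


def connection_settings(env=None):
    """Build keyword arguments for mysql.connector.connect().

    The MYSQL_AI_* names are hypothetical; defaults mirror the cell above.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_AI_HOST", "localhost"),
        "port": int(env.get("MYSQL_AI_PORT", "3306")),
        "user": env.get("MYSQL_AI_USER", "root"),
        "password": env.get("MYSQL_AI_PASSWORD", "unused"),
        "database": env.get("MYSQL_AI_DATABASE", "mlcorpus"),
    }


# Pass a plain dict to preview the result without touching the real environment.
settings = connection_settings({"MYSQL_AI_PORT": "3307"})
print(settings["host"], settings["port"])
```

The resulting dictionary can be unpacked into the existing call, for example `mysql.connector.connect(**connection_settings(), allow_local_infile=True, use_pure=True, autocommit=True)`.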


# ML_GENERATE operation

You can perform content generation using the [ML_GENERATE](https://dev.mysql.com/doc/heatwave/en/mys-hwgenai-ml-generate.html) routine.

### Content generation

Invoke the [ML_GENERATE](https://dev.mysql.com/doc/heatwave/en/mys-hwgenai-ml-generate.html) routine to query the LLM.

In [5]:
df = execute_sql("""SELECT JSON_PRETTY(sys.ML_GENERATE("In which year were the first modern Olympic Games held?", NULL));""")
json.loads(df.iat[0,0])["text"]

'The first modern Olympic Games were held in 1896, in Athens, Greece. They were organized by the International Olympic Committee (IOC) and took place from April 6 to April 15, 1896.'
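
The cell above indexes straight into the returned JSON with `["text"]`, which raises an error if the value is missing or not a string. A slightly more defensive reader might look like the sketch below. The helper name `extract_text` is hypothetical, the sample payload is hand-written for illustration, and the handling of `bytes` is an assumption about how the connector may return JSON values, not documented behavior of this setup.

```python
import json


def extract_text(raw):
    """Return the "text" field of an ML_GENERATE result, or None if unavailable."""
    if raw is None:
        return None
    # Assumption: the value may arrive as bytes rather than str.
    if isinstance(raw, (bytes, bytearray)):
        raw = raw.decode("utf-8")
    payload = json.loads(raw) if isinstance(raw, str) else raw
    if not isinstance(payload, dict):
        return None
    return payload.get("text")


# Hand-written sample payload; only the "text" key is taken from the notebook.
sample = '{"text": "The first modern Olympic Games were held in 1896."}'
print(extract_text(sample))
```

In the notebook this would be called as `extract_text(df.iat[0, 0])`.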

In [6]:
df = execute_sql("""SELECT JSON_PRETTY(sys.ML_GENERATE("How many Olympians participated in the 2024 Games?", NULL));""")
json.loads(df.iat[0,0])["text"]

"I don't have information about the specific number of Olympians who will participate in the 2024 Games, as my training data only goes up until 2022 and I do not have real-time access to current events or future information. However, I can suggest checking the official Olympic website or other reliable news sources for the most up-to-date information on the participants of the 2024 Paris Olympics."

The model cannot answer the question above because its training data extends only up to 2022.

Let's provide the model with up-to-date information by using the `context` option of ML_GENERATE.

In [7]:
df = execute_sql("""SELECT JSON_PRETTY(sys.ML_GENERATE("How many Olympians participated in the 2024 Games?", JSON_OBJECT("context", "Paris has been the host for the 2024 Olympic and Paralympic Games. Over the course of the summer, the city – and other venues across France – have welcomed 10,500 Olympians (competing in 32 sports) and over 4,000 Paralympians (competing in 22 sports).")));""")
json.loads(df.iat[0,0])["text"]

'According to the text, there were 10,500 Olympians who competed in the 2024 Olympic Games.'
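
Embedding the prompt and context directly in the SQL string, as in the cell above, breaks as soon as the text contains a double quote. One possible alternative is to pass both as query parameters and let Connector/Python handle the escaping. The sketch below only builds the statement and its parameters. The helper name `build_generate_call` is hypothetical, and it is an assumption (not verified here against a MySQL AI instance) that ML_GENERATE accepts options supplied through `CAST(%s AS JSON)` in the same way as a `JSON_OBJECT(...)` literal.

```python
import json


def build_generate_call(prompt, context=None):
    """Return (sql, params) for a parameterized ML_GENERATE query.

    The options are serialized with json.dumps so quotes in the context
    are escaped correctly; a missing context is sent as SQL NULL.
    """
    options = None if context is None else json.dumps({"context": context})
    sql = "SELECT JSON_PRETTY(sys.ML_GENERATE(%s, CAST(%s AS JSON)))"
    return sql, (prompt, options)


sql, params = build_generate_call(
    "How many Olympians participated in the 2024 Games?",
    'The city welcomed 10,500 Olympians to what organizers called "Paris 2024".',
)
print(sql)
print(params[1])
```

With the connection from earlier in the notebook, the statement could then be run as `mycursor.execute(sql, params)` followed by `mycursor.fetchall()`. The existing `execute_sql` helper takes only a SQL string, so it would need an extra `params` argument to be used this way.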