# Process and analyze text with built-in Databricks SQL AI functions

Databricks SQL provides built-in GenAI capabilities that let you perform ad hoc operations with state-of-the-art LLMs optimized for these tasks. It uses Mixtral behind the scenes.

These functions are the following:

- ai_analyze_sentiment
- ai_classify
- ai_extract
- ai_fix_grammar
- ai_gen
- ai_mask
- ai_similarity
- ai_summarize
- ai_translate

Using these functions is pretty straightforward. Under the hood, they call specialized LLMs with a custom prompt, providing fast answers.

You can use them on any column in any SQL query. This demo runs on a SQL Warehouse, so please make sure you have a SQL Warehouse ready!
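
As a minimal sketch of applying the functions to a column, the query below runs two of them over every row of a small inline dataset. The `VALUES` rows, the `review` column name, and the result aliases are placeholders chosen for illustration; replace the `VALUES` clause with your own table.

```sql
-- Apply AI functions to each row of a column.
-- The inline VALUES rows stand in for a real table (illustrative only).
SELECT
  review,
  ai_analyze_sentiment(review) AS sentiment,
  ai_classify(review, ARRAY('urgent', 'not urgent')) AS priority
FROM VALUES
  ('I love the new features in the latest software update!'),
  ('After installing update, PC starting slow.'),
  ('My password is leaked.')
AS t(review);
```

Each function is evaluated once per row, so on a large table it is worth testing with a `LIMIT` first.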

Credit for my knowledge on this topic goes to https://notebooks.databricks.com/demos/sql-ai-functions/index.html# 

You can download this from https://github.com/yogi-playground/Databricks/tree/main/Databricks%20AI%20Function 

Import required packages

In [0]:
# Install dbdemos, the package used below to install the SQL AI functions demo
%pip install dbdemos

Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.
Collecting dbdemos
  Downloading dbdemos-0.5.3-py3-none-any.whl (35.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 35.9/35.9 MB 21.9 MB/s eta 0:00:00
Installing collected packages: dbdemos
Successfully installed dbdemos-0.5.3


In [0]:
# Import dbdemos and install the sql-ai-functions demo (pass overwrite=True if the demo folder already exists)
import dbdemos
dbdemos.install('sql-ai-functions')

Installing demo sql-ai-functions under /Shared/AMZ_Review, please wait...
Help us improving dbdemos, share your feedback or create an issue if something isn't working: https://github.com/databricks-demos/dbdemos


---------------------------------------------------------------------------
ExistingResourceException                 Traceback (most recent call last)
File <command-1174645211342129>, line 3
      1 #import and install databricks LLM AI funcation module dbdemos 
      2 import dbdemos
----> 3 dbdemos.install('sql-ai-functions')

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-aa7e71b1-ac65-4c71-9f14-22d891e24934/lib/python3.10/site-packages/dbdemos/dbdemos.py:225, in install(demo_name, path, overwrite, username, pat_token, workspace_url, skip_dashboards, cloud, start_cluster, use_current_cluster, current_cluster_id, install_dashboard_sequentially, debug, catalog, schema, serverless)
    222 if not installer

### Using a SQL Warehouse to run this demo
This demo runs using a SQL Warehouse!

Make sure you select one using the dropdown on the top right of your notebook (don't select a classic compute/cluster)

In [0]:
%sql
-- verify that we're running on a SQL Warehouse
SELECT assert_true(current_version().dbsql_version is not null, 'YOU MUST USE A SQL WAREHOUSE, not a cluster');

"assert_true((current_version().dbsql_version IS NOT NULL), YOU MUST USE A SQL WAREHOUSE, not a cluster)"
""


In [0]:
%sql
-- Verify that the AI functions work as expected. You can use ai_gen to test your setup the same way you would prompt ChatGPT or any other LLM application.
SELECT ai_gen('Generate a concise, cheerful email title for a summer bike sale with 20% discount');

"ai_gen(Generate a concise, cheerful email title for a summer bike sale with 20% discount)"
🎉 Summer Bike Sale: Grab Your Dream Bike at 20% Off! 🚲☀️


### 1. ai_analyze_sentiment: 
Analyzes the sentiment of a given text, determining whether the expressed opinion is positive, negative, neutral, or mixed.

  Example 1
   - Input: "I love the new features in the latest software update! It's so user-friendly and efficient."
   - Output: "positive"
   
  Example 2
   - Input: "After installing update, PC starting slow."
   - Output: "negative"

In [0]:
%sql
SELECT ai_analyze_sentiment("I love the new features in the latest software update! It's so user-friendly and efficient but my system is slow now.");

ai_analyze_sentiment(I love the new features in the latest software update! It's so user-friendly and efficient but my system is slow now.)
mixed


### 2. ai_classify: 
Categorizes text or data into predefined classes or labels based on its content.

Example:
  - Input: "This email is regarding the annual budget meeting."
  - Output: "Category: Business Communication"

In [0]:
%sql
SELECT ai_classify("My password is leaked.", ARRAY("urgent", "not urgent"));

"ai_classify(My password is leaked., array(urgent, not urgent))"
urgent


### 3. ai_extract: 
Extracts specific information, such as entities, keywords, or other relevant data, from a given text.

Example:
  - Input: "John Doe will be attending the conference in New York on September 12th."
  - Output: "Entities: [Name: John Doe, Event: conference, Location: New York, Date: September 12th]"

In [0]:
%sql
SELECT ai_extract(
  'John Doe lives in New York and works for Acme Corp.',
  array('person', 'location', 'organization')
);
SELECT ai_extract(
  'Send an email to jane.doe@example.com about the meeting at 10am.',
  array('email', 'time')
);

"ai_extract(John Doe lives in New York and works for Acme Corp., array(person, location, organization))"
"Map(person -> John Doe, location -> New York, organization -> Acme Corp)"


"ai_extract(Send an email to jane.doe@example.com about the meeting at 10am., array(email, time))"
"Map(email -> jane.doe@example.com, time -> 10am)"


### 4. ai_fix_grammar: 
Identifies and corrects grammatical errors in a text, ensuring proper syntax and language usage.

Example:
  - Input: "He dont like to eat vegetables."
  - Output: "He doesn't like to eat vegetables."

In [0]:
%sql
SELECT ai_fix_grammar('He dont like to eat vegetables');

ai_fix_grammar(He dont like to eat vegetables)
He doesn't like to eat vegetables


In [0]:
%sql
SELECT ai_fix_grammar('This sentence have some mistake');

SELECT ai_fix_grammar('She dont know what to did.');

SELECT ai_fix_grammar('He go to school every days.');


ai_fix_grammar(This sentence have some mistake)
This sentence has some mistakes


ai_fix_grammar(She dont know what to did.)
She doesn't know what to do.


ai_fix_grammar(He go to school every days.)
He goes to school every day.


### 5. ai_gen: 
Generates text based on a given prompt or context, creating coherent and contextually relevant content.

Example:
  - Input: "Write a short story about a brave knight."
  - Output: "Once upon a time, in a land far away, there was a brave knight named Sir Gallant. He embarked on a quest to rescue the princess from a formidable dragon. With courage and skill, Sir Gallant defeated the dragon and saved the princess, bringing peace to the kingdom."

In [0]:
%sql
select ai_gen('Write a short story about a brave knight.')

In [0]:
%sql
SELECT ai_gen('Generate a concise, cheerful email title for a summer bike sale with 20% discount');


"ai_gen(Generate a concise, cheerful email title for a summer bike sale with 20% discount)"
🎉 Summer Bike Sale: Grab Your Dream Bike at 20% Off! 🚲☀️


### 6. ai_mask: 
Masks or hides sensitive or specific information within a text, such as personal data or confidential details.

Example:
  - Input: "Contact me at john.doe@example.com for more information."
  - Output: "Contact me at [MASKED] for more information."


In [0]:
%sql
SELECT ai_mask('John Doe lives in New York. His email is john.doe@example.com.', array('person', 'email')  );

"ai_mask(John Doe lives in New York. His email is john.doe@example.com., array(person, email))"
[MASKED] lives in New York. His email is [MASKED].


In [0]:
%sql
SELECT ai_mask(
'Contact me at 555-1234 or visit us at 123 Main St.',
array('phone', 'address')
);

"ai_mask(Contact me at 555-1234 or visit us at 123 Main St., array(phone, address))"
Contact me at [MASKED] or visit us at [MASKED]


### 7. ai_similarity: 
Computes the similarity score between two texts or data points, indicating how closely they are related or match.

Example:
  - Input: "Databricks" and "Apache Spark"
  - Output: "Similarity Score: 0.75" (Note: This is an illustrative score; actual computation may vary)

In [0]:
%sql
SELECT ai_similarity('Databricks', 'Apache Spark'),  ai_similarity('Apache Spark', 'The Apache Spark Engine');

"ai_similarity(Databricks, Apache Spark)","ai_similarity(Apache Spark, The Apache Spark Engine)"
0.6715619,0.83821434
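
One way to use these scores, sketched here on the assumption that a higher score means a closer match, is to rank several candidate strings against a reference string. The candidates reuse the strings from the cell above; the `candidate` and `score` names are illustrative.

```sql
-- Rank candidate strings by their similarity to a reference string.
SELECT
  candidate,
  ai_similarity('Apache Spark', candidate) AS score
FROM VALUES
  ('Databricks'),
  ('The Apache Spark Engine')
AS t(candidate)
ORDER BY score DESC;
```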


### 8. ai_summarize: 
Condenses a long text into a shorter version, capturing the main points and essential information.

Example:
  - Input: "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge, reasoning, problem-solving, perception, learning, and planning."
  - Output: "AI is a computer science branch focused on creating intelligent machines, essential in tech, involving knowledge, reasoning, problem-solving, perception, learning, and planning."


In [0]:
%sql
select ai_summarize('Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge, reasoning, problem-solving, perception, learning, and planning.')

"ai_summarize(Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge, reasoning, problem-solving, perception, learning, and planning., 50)"
"Artificial Intelligence (AI) is a computer science field creating intelligent machines, a key part of the tech industry. AI research focuses on programming computers for traits like knowledge, reasoning, problem-solving, perception, learning, and planning."


In [0]:
%sql
select ai_summarize('Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge, reasoning, problem-solving, perception, learning, and planning.',20)

"ai_summarize(Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry. Research associated with artificial intelligence is highly technical and specialized. The core problems of artificial intelligence include programming computers for certain traits such as knowledge, reasoning, problem-solving, perception, learning, and planning., 20)"
"Artificial Intelligence is a specialized field of computer science focused on creating intelligent machines, addressing problems like knowledge representation, reasoning, and learning."


### 9. ai_translate: 
Translates text from one language to another while maintaining the original meaning and context.

Example:
  - Input: "Hello, how are you?" (English)
  - Output: "Hola, ¿cómo estás?" (Spanish)

In [0]:
%sql
SELECT ai_translate("This function is so amazing!", "fr");
SELECT ai_translate("This function is so amazing!", "hi")

"ai_translate(This function is so amazing!, fr)"
Cette fonction est tellement géniale !


"ai_translate(This function is so amazing!, hi)"
Yeh function itni amazeengi hai!
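
Since `ai_summarize` and `ai_translate` both return strings, the calls can be nested. A minimal sketch, assuming the summary can be passed straight to `ai_translate` (the `french_summary` alias is illustrative):

```sql
-- Summarize a text to roughly 20 words, then translate the summary to French.
SELECT ai_translate(
  ai_summarize('Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines. It has become an essential part of the technology industry.', 20),
  'fr'
) AS french_summary;
```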


### Bonus Info

- Notebook SQL cells are now smart cells with autocompletion and syntax highlighting. They also provide suggestions for writing comments.