#SQLCoder-7b-2
Run the cells below to run inference on our text-to-SQL LLM: SQLCoder-7b-2.

⭐️ [GitHub Repo](https://github.com/defog-ai/sqlcoder)

🤗 [Hugging Face Page](https://huggingface.co/defog/sqlcoder-7b-2)

##Setup

In [1]:
!pip install torch transformers bitsandbytes accelerate sqlparse

Collecting bitsandbytes
  Downloading bitsandbytes-0.47.0-py3-none-manylinux_2_24_x86_64.whl.metadata (11 kB)
Downloading bitsandbytes-0.47.0-py3-none-manylinux_2_24_x86_64.whl (61.3 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.47.0


In [2]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

In [3]:
torch.cuda.is_available()

True

##Download the Model
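Use a GPU runtime on Colab (or any system with >30GB VRAM on your own machine) to load this in float16. The float16 weights alone are roughly 13.5GB (the three shards in the download log below), and the code comment below treats 15GB of GPU memory as the practical minimum. If that is unavailable, use a GPU with at least 8GB of VRAM to load the model in 8-bit, or at least 5GB of VRAM to load it in 4-bit.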

This step can take around 5 minutes the first time. So please be patient :)
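
The thresholds above can be turned into a small helper that picks a loading mode from the available VRAM. This is a minimal sketch and not part of the original notebook: `pick_precision` and `load_kwargs` are hypothetical names, the 15GB float16 floor is an assumption taken from the code comment in the next cell, and the `BitsAndBytesConfig` usage assumes a `transformers` install with bitsandbytes quantization support.

```python
def pick_precision(vram_gb, fp16_min_gb=15.0):
    """Map available GPU memory (in GB) to a loading mode.

    The 8GB and 5GB thresholds come from the notebook text.
    The float16 floor (15GB) is an assumption taken from the code comment.
    """
    if vram_gb >= fp16_min_gb:
        return "float16"
    if vram_gb >= 8:
        return "8bit"
    if vram_gb >= 5:
        return "4bit"
    raise ValueError(f"{vram_gb:.1f}GB of VRAM is below the 5GB minimum for 4-bit loading")


def load_kwargs(precision):
    """Build keyword arguments for AutoModelForCausalLM.from_pretrained."""
    # Imported lazily so pick_precision works without a GPU stack installed.
    import torch
    from transformers import BitsAndBytesConfig

    if precision == "float16":
        return {"torch_dtype": torch.float16, "device_map": "auto"}
    if precision == "8bit":
        config = BitsAndBytesConfig(load_in_8bit=True)
        return {"quantization_config": config, "device_map": "auto"}
    if precision == "4bit":
        config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
        return {"quantization_config": config, "device_map": "auto"}
    raise ValueError(f"unknown precision: {precision}")


print(pick_precision(10))  # a 10GB card falls in the 8-bit range
```

On a machine with a GPU, `vram_gb` could be read with `torch.cuda.get_device_properties(0).total_memory / 1024**3`, and the result passed on as `AutoModelForCausalLM.from_pretrained(model_name, **load_kwargs(pick_precision(vram_gb)))`.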

In [4]:
model_name = "defog/sqlcoder-7b-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# If you have at least 15GB of GPU memory, load the model in float16.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.float16,  # recent transformers releases warn that `dtype` replaces `torch_dtype`
    device_map="auto",
    use_cache=True,
)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json: 0.00B [00:00, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/515 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/691 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/4.95G [00:00<?, ?B/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/4.94G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/3.59G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]

##Set the Question & Prompt and Tokenize
Feel free to replace the schema in the prompt below with your own.

In [5]:
prompt = """### Task
Generate a SQL query to answer [QUESTION]{question}[/QUESTION]

### Instructions
- If you cannot answer the question with the available database schema, return 'Check EspnCricInfo'
- Remember that battingaverage is sum of Runs divided by number of matches
- Remember that Century is Runs greater than or equal to 100


### Database Schema
This query will run on a database whose schema is represented in this string:
CREATE TABLE Scores (
  MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
  Opposition VARCHAR(50), -- Name of cricket team
  Innings INTEGER, -- batted first or second
  Runs INTEGER  -- Runs Scored in the match
);


"""
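
The post-processing later in the notebook splits the decoded text on `[SQL]`, but the prompt above never contains that marker. The SQLCoder model card appears to end its prompt with an `### Answer` section that closes on `[SQL]`. The sketch below assumes that layout: `build_prompt`, `ANSWER_SECTION`, and the sample arguments are hypothetical, so check the template against the Hugging Face page linked at the top before relying on it.

```python
ANSWER_SECTION = (
    "### Answer\n"
    "Given the database schema, here is the SQL query that answers "
    "[QUESTION]{question}[/QUESTION]\n"
    "[SQL]\n"
)


def build_prompt(question, schema, instructions):
    """Assemble a prompt that ends with the [SQL] marker."""
    instruction_lines = "\n".join(f"- {line}" for line in instructions)
    return (
        "### Task\n"
        f"Generate a SQL query to answer [QUESTION]{question}[/QUESTION]\n\n"
        "### Instructions\n"
        f"{instruction_lines}\n\n"
        "### Database Schema\n"
        "This query will run on a database whose schema is represented in this string:\n"
        f"{schema.strip()}\n\n"
        + ANSWER_SECTION.format(question=question)
    )


schema = """CREATE TABLE Scores (
  MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
  Opposition VARCHAR(50), -- Name of cricket team
  Innings INTEGER, -- batted first or second
  Runs INTEGER  -- Runs Scored in the match
);"""

example_prompt = build_prompt(
    "Runs scored during First Innings",
    schema,
    ["Remember that Century is Runs greater than or equal to 100"],
)
print(example_prompt)
```

With a prompt that ends this way, splitting the decoded text on `[SQL]` leaves only the completion. The outputs shown later in this notebook were produced with the original prompt, so results with this template may differ.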

##Generate the SQL
This can be excruciatingly slow on a T4 in Colab, taking 10-20 seconds per query. On faster GPUs, it takes roughly 1-2 seconds.

Ideally, you should use `num_beams=4` for best results, but because of memory constraints we will stick to 1 for now.

In [6]:
import sqlparse

def generate_query(question):
    updated_prompt = prompt.format(question=question)
    inputs = tokenizer(updated_prompt, return_tensors="pt").to("cuda")
    generated_ids = model.generate(
        **inputs,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=400,
        do_sample=False,
        num_beams=1,
    )
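    outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

    # Empty the CUDA cache so that you can generate more results without running
    # out of memory. This is particularly important on Colab; memory management
    # is much more straightforward when running on an inference service.
    torch.cuda.empty_cache()
    torch.cuda.synchronize()

    # Keep only the text after the last "[SQL]" marker. The prompt above contains
    # no "[SQL]" marker, so this currently returns the whole decoded text (prompt
    # included), which is why the outputs below echo the prompt before the query.
    return sqlparse.format(outputs[0].split("[SQL]")[-1], reindent=True)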
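
Because `generate` returns the prompt tokens followed by the new tokens, decoding `generated_ids` as-is includes the prompt. One option is to slice the tensor before decoding (`generated_ids[:, inputs["input_ids"].shape[1]:]`). The same idea on plain strings is sketched below. `extract_sql` is a hypothetical helper, and cutting at the first semicolon assumes the model answers with a single statement that has no semicolons inside string literals.

```python
def extract_sql(decoded, prompt_text=None, marker="[SQL]"):
    """Return only the generated query from decoded model output."""
    if marker in decoded:
        # Prompt ended with the marker: keep what follows its last occurrence.
        completion = decoded.split(marker)[-1]
    elif prompt_text is not None and decoded.startswith(prompt_text):
        # No marker: drop the echoed prompt by prefix.
        completion = decoded[len(prompt_text):]
    else:
        # Nothing can be stripped reliably, so return the text as-is.
        return decoded.strip()
    completion = completion.strip()
    # Assumption: one statement per answer, so stop at the first semicolon.
    end = completion.find(";")
    return completion if end == -1 else completion[: end + 1]


echoed_prompt = "### Task\nCREATE TABLE Scores (...);\n"
decoded_text = echoed_prompt + "\nSELECT SUM(s.Runs) FROM Scores s;\n trailing text"
print(extract_sql(decoded_text, prompt_text=echoed_prompt))
```

Prefix stripping only works when the decoded text reproduces the prompt exactly, which may not hold after tokenization and `skip_special_tokens=True`. Slicing the token tensor avoids that dependency.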

In [7]:
question = "Total Runs Scored against each opposition"
generated_sql = generate_query(question)
print(generated_sql)

### Task Generate a SQL query to answer [QUESTION]Total Runs Scored against each opposition[/QUESTION] ### Instructions - If you cannot answer the question with the available database schema,
                                                                                                                                                                                        return 'Check EspnCricInfo' - Remember that battingaverage is sum of Runs divided by number of matches - Remember that Century is Runs greater than
or equal to 100 ### Database Schema This query will run on a database whose schema is represented in this string:
CREATE TABLE Scores (MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
 Opposition VARCHAR(50), -- Name of cricket team
 Innings INTEGER, -- batted first or second
 Runs INTEGER -- Runs Scored in the match
);


SELECT s.Opposition,
       SUM(s.Runs) AS total_runs,
       COUNT(s.MatchID) AS matches_played,
       AVG(s.Runs) AS batting_average,
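
The output above ends after the `batting_average` column, before any `FROM` clause, so it cannot be run as printed. For comparison, here is a hand-written query for the same question (not model output), checked against an in-memory SQLite database. The sample rows are made up for illustration, and SQLite is only a stand-in for whatever database the schema targets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Scores (
         MatchID INTEGER PRIMARY KEY,
         Opposition VARCHAR(50),
         Innings INTEGER,
         Runs INTEGER
       )"""
)
# Hypothetical sample data: (MatchID, Opposition, Innings, Runs)
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?, ?)",
    [(1, "Australia", 1, 120), (2, "Australia", 2, 35), (3, "England", 1, 80)],
)

# Hand-written query, not produced by the model.
total_runs_sql = """
SELECT s.Opposition, SUM(s.Runs) AS total_runs
FROM Scores s
GROUP BY s.Opposition
ORDER BY s.Opposition;
"""
totals = conn.execute(total_runs_sql).fetchall()
print(totals)  # [('Australia', 155), ('England', 80)]
```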
       

In [8]:
question = "Runs scored during First Innings"
generated_sql = generate_query(question)
print(generated_sql)

### Task Generate a SQL query to answer [QUESTION]Runs scored during First Innings[/QUESTION] ### Instructions - If you cannot answer the question with the available database schema,
                                                                                                                                                                               return 'Check EspnCricInfo' - Remember that battingaverage is sum of Runs divided by number of matches - Remember that Century is Runs greater than
or equal to 100 ### Database Schema This query will run on a database whose schema is represented in this string:
CREATE TABLE Scores (MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
 Opposition VARCHAR(50), -- Name of cricket team
 Innings INTEGER, -- batted first or second
 Runs INTEGER -- Runs Scored in the match
);


SELECT SUM(s.Runs) AS total_runs
FROM Scores s
WHERE s.Innings = 1;


In [9]:
question = "BattingAverage"
generated_sql = generate_query(question)
print(generated_sql)

### Task Generate a SQL query to answer [QUESTION]BattingAverage[/QUESTION] ### Instructions - If you cannot answer the question with the available database schema,
                                                                                                                                                             return 'Check EspnCricInfo' - Remember that battingaverage is sum of Runs divided by number of matches - Remember that Century is Runs greater than
or equal to 100 ### Database Schema This query will run on a database whose schema is represented in this string:
CREATE TABLE Scores (MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
 Opposition VARCHAR(50), -- Name of cricket team
 Innings INTEGER, -- batted first or second
 Runs INTEGER -- Runs Scored in the match
);


SELECT CAST(SUM(s.Runs) AS FLOAT) / NULLIF(COUNT(s.MatchID), 0) AS BattingAverage
FROM Scores s;


In [10]:
question = "List of Matches where scores was century"
generated_sql = generate_query(question)
print(generated_sql)

### Task Generate a SQL query to answer [QUESTION]List of Matches
where scores was century[/QUESTION] ### Instructions - If you cannot answer the question with the available database schema,
                                                                                                                     return 'Check EspnCricInfo' - Remember that battingaverage is sum of Runs divided by number of matches - Remember that Century is Runs greater than
  or equal to 100 ### Database Schema This query will run on a database whose schema is represented in this string:
  CREATE TABLE Scores (MatchID INTEGER PRIMARY KEY, -- Unique ID for each Match
 Opposition VARCHAR(50), -- Name of cricket team
 Innings INTEGER, -- batted first or second
 Runs INTEGER -- Runs Scored in the match
);


SELECT s.MatchID,
       s.Opposition,
       s.Innings,
       s.Runs
FROM Scores s
WHERE s.Runs >= 100
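
A quick way to sanity-check generated queries is to run them against a small SQLite database and compare the results with values computed directly in Python. The sketch below does this for the three complete queries above (copied as printed, whitespace aside). The sample rows are invented, and SQLite is an assumption standing in for the real target database.

```python
import sqlite3

rows = [  # hypothetical (MatchID, Opposition, Innings, Runs)
    (1, "Australia", 1, 120),
    (2, "Australia", 2, 35),
    (3, "England", 1, 100),
    (4, "India", 2, 45),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Scores (MatchID INTEGER PRIMARY KEY, Opposition VARCHAR(50), "
    "Innings INTEGER, Runs INTEGER)"
)
conn.executemany("INSERT INTO Scores VALUES (?, ?, ?, ?)", rows)

# Queries as generated by the model in the cells above.
first_innings = conn.execute(
    "SELECT SUM(s.Runs) AS total_runs FROM Scores s WHERE s.Innings = 1;"
).fetchone()[0]
batting_average = conn.execute(
    "SELECT CAST(SUM(s.Runs) AS FLOAT) / NULLIF(COUNT(s.MatchID), 0) AS BattingAverage "
    "FROM Scores s;"
).fetchone()[0]
centuries = conn.execute(
    "SELECT s.MatchID, s.Opposition, s.Innings, s.Runs FROM Scores s WHERE s.Runs >= 100"
).fetchall()

# Expected values, following the definitions in the prompt's instructions.
expected_first = sum(r[3] for r in rows if r[2] == 1)
expected_average = sum(r[3] for r in rows) / len(rows)
expected_centuries = [r for r in rows if r[3] >= 100]

print(first_innings == expected_first)
print(abs(batting_average - expected_average) < 1e-9)
print(sorted(centuries) == sorted(expected_centuries))
```

Each line prints `True` when the generated query agrees with the definition it was meant to implement.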
