# Text to SQL Query Helper

Hugging Face Transformers is an open-source framework for deep learning created by Hugging Face.
It provides APIs and tools to download state-of-the-art pretrained models and fine-tune them to improve performance on your own task.
These models support common tasks in different modalities, such as natural language processing, computer vision, audio, and multi-modal applications.
Using pretrained models can reduce your compute costs and carbon footprint,
and save you the time and resources required to train a model from scratch.

https://huggingface.co/docs/transformers/index
https://huggingface.co/docs/hub/index

Accelerate is a library that helps users train a 🤗 Transformers model on any type of distributed setup,
whether that is multiple GPUs on one machine or multiple GPUs across several machines.

## Logging in to Hugging Face Hub

In [1]:
import os
from huggingface_hub import login

login(os.getenv("HUGGINGFACEHUB_API_TOKEN"))

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /Users/suvosmac/.cache/huggingface/token
Login successful
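
Note that `os.getenv` returns `None` when the variable is unset, so it can help to check the token before calling `login`. The following is a minimal sketch of such a guard; `read_hf_token` is a hypothetical helper written for this notebook, not part of `huggingface_hub`, and the token value is a placeholder.

```python
import os


def read_hf_token(env=None, var_name="HUGGINGFACEHUB_API_TOKEN"):
    """Return the token stored under `var_name`, or raise if it is missing.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    token = (env.get(var_name) or "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {var_name} is not set or is empty.")
    return token


# Example with an explicit mapping instead of the real environment.
print(read_hf_token({"HUGGINGFACEHUB_API_TOKEN": "hf_example"}))
```

The returned string could then be passed to `login(...)` in place of the direct `os.getenv` call.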


## Essential Library Imports

In [2]:
# HuggingFacePipeline wraps a Hugging Face Transformers pipeline so that LangChain can use it as an LLM
from langchain import HuggingFacePipeline

# This line imports the AutoTokenizer class from the transformers library.
# The AutoTokenizer class is used to load tokenizers for various pre-trained language models available in the Hugging Face model hub.
from transformers import AutoTokenizer

# This line imports the entire transformers library, which is a popular library developed by
# Hugging Face for working with various transformer-based models in natural language processing (NLP),
# including both models and tokenizers.
import transformers

# Import the torch library
import torch

## Initialize Model and Pipeline

In [4]:
model = "meta-llama/Llama-2-7b-chat-hf"

# Load the tokenizer that matches the model
tokenizer = AutoTokenizer.from_pretrained(model)

# Set up the text generation pipeline (this also loads the model weights)
pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer, torch_dtype=None,
                                 device_map="auto", max_new_tokens=512, do_sample=True,
                                 top_k=10, num_return_sequences=1,
                                 eos_token_id=tokenizer.eos_token_id)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]
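
The `do_sample=True` and `top_k=10` arguments ask the pipeline to sample each next token from only the 10 highest-scoring candidates. The idea can be sketched in plain Python on a toy score table. The function name `top_k_sample` and the scores are made up for illustration, and this is not how Transformers implements sampling internally.

```python
import math
import random


def top_k_sample(scores, k, rng=None):
    """Sample one key from `scores` (token -> logit), restricted to the top k.

    Hypothetical illustration of top-k sampling; not the Transformers code.
    """
    rng = rng or random.Random()
    # Keep only the k highest-scoring candidates.
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Softmax weights over the survivors (subtract the max for stability).
    max_logit = top[0][1]
    weights = [math.exp(logit - max_logit) for _, logit in top]
    tokens = [token for token, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy logits for the next token (made-up numbers).
toy_scores = {"SELECT": 5.0, "select": 3.5, "The": 1.0, "Here": 0.5}
print(top_k_sample(toy_scores, k=2, rng=random.Random(0)))
```

With `k=1` the sketch always returns the highest-scoring token, which is the greedy choice. Larger values of `k` allow more variety in the generated text.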

In [5]:
# 'HuggingFacePipeline' class creates a custom pipeline for text generation, and we are passing
# the pipeline that we defined earlier along with some model-specific keyword arguments - temperature here.

llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={'temperature': 0})

## Define the Prompt and Initialize the LLMChain

In [6]:
from langchain import PromptTemplate, LLMChain

template = """
            Create a SQL Query snippet using the below text:
            ```{text}```
            Just SQL Query:
           """

prompt = PromptTemplate(template=template, input_variables=["text"])

llmchain = LLMChain(prompt=prompt, llm=llm)

text = """Extract all the unique values from "age" column
        """

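To preview the prompt that the chain would send to the model, the placeholder substitution can be reproduced with plain `str.format`. This sketch uses a trimmed-down copy of the template above and assumes the `{text}` placeholder is filled the same way as `str.format` fills it. `render_prompt` is a hypothetical helper, not a LangChain function.

```python
# Build the three-backtick delimiter programmatically so this example
# can sit inside a fenced block without closing it.
TICKS = "`" * 3

# Trimmed-down copy of the notebook's template (indentation removed).
template = "\n".join([
    "Create a SQL Query snippet using the below text:",
    TICKS + "{text}" + TICKS,
    "Just SQL Query:",
])


def render_prompt(text):
    """Fill the {text} placeholder after stripping surrounding whitespace."""
    return template.format(text=text.strip())


rendered = render_prompt('Extract all the unique values from "age" column\n')
print(rendered)
```

Printing the rendered prompt before running the chain is a cheap way to catch typos in the template or the input text.
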
## Execute the LLMChain

In [7]:
print(llmchain.run(text))  # This needs a capable GPU to finish in reasonable time

KeyboardInterrupt: 
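
Chat models often wrap the query in extra prose, so the string returned by the chain may need trimming before use. Below is a rough sketch of one way to do that. `extract_sql` is a hypothetical helper, and the sample reply and the `people` table name are made up for illustration. The run above was interrupted, so no real model output is shown here.

```python
import re

# Match from the first SELECT up to and including the first semicolon.
SQL_PATTERN = re.compile(r"\bSELECT\b.*?;", re.IGNORECASE | re.DOTALL)


def extract_sql(generated):
    """Return the first SELECT ... ; statement in `generated`, or None."""
    match = SQL_PATTERN.search(generated)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(match.group(0).split())


# Made-up reply for illustration only; not actual model output.
sample_reply = "Sure! Here is the query:\nSELECT DISTINCT age\nFROM people;\nHope that helps."
print(extract_sql(sample_reply))
```

The pattern only covers simple `SELECT` statements that end with a semicolon. Other statement types would need a broader pattern or a real SQL parser.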