<a href="https://colab.research.google.com/github/look4pritam/RetrievalAugmentedGeneration/blob/master/Notebooks/RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG

### Set locale.

In [1]:
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

'en_US.UTF-8'

### Install Python packages.

In [1]:
!pip3 install accelerate==0.21.0

Collecting accelerate==0.21.0
  Downloading accelerate-0.21.0-py3-none-any.whl (244 kB)
Installing collected packages: accelerate
Successfully installed accelerate-0.21.0


In [2]:
!pip3 install bitsandbytes==0.40.2

Collecting bitsandbytes==0.40.2
  Downloading bitsandbytes-0.40.2-py3-none-any.whl (92.5 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.40.2


In [3]:
!pip3 install transformers==4.38.1



### Create Tokenizer.

In [4]:
from transformers import AutoTokenizer

In [5]:
model_name = "mistralai/Mistral-7B-Instruct-v0.1"

In [6]:
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = 'right'

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json:   0%|          | 0.00/1.47k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.80M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]
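
The tokenizer cell above reuses the end-of-sequence token as the padding token and pads on the right. A minimal pure-Python sketch of what right padding does to a batch is shown here; `pad_batch` is a hypothetical helper written for illustration and is not part of `transformers`, and the token ids are made up.

```python
def pad_batch(batch, pad_id, padding_side="right"):
    """Pad every sequence in `batch` to the length of the longest one.

    Returns the padded sequences and a matching attention mask
    (1 for real tokens, 0 for padding).
    """
    max_len = max(len(seq) for seq in batch)
    padded, masks = [], []
    for seq in batch:
        fill = [pad_id] * (max_len - len(seq))
        if padding_side == "right":
            padded.append(list(seq) + fill)
            masks.append([1] * len(seq) + [0] * len(fill))
        else:
            padded.append(fill + list(seq))
            masks.append([0] * len(fill) + [1] * len(seq))
    return padded, masks


# Hypothetical token ids; pad_id=2 mirrors the eos_token_id reported later in this notebook.
ids, mask = pad_batch([[5, 6, 7], [8]], pad_id=2)
print(ids)   # [[5, 6, 7], [8, 2, 2]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```

The attention mask is what lets the model ignore the padded positions even though they carry a real token id.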

### Set 'bitsandbytes' parameters.

In [7]:
# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = 'float16'

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = 'nf4'

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

### Set up quantization config.

In [8]:
import torch

In [9]:
from transformers import BitsAndBytesConfig

In [10]:
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)
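
To see why 4-bit loading matters on a Colab GPU, a rough back-of-the-envelope estimate of weight memory can help. The sketch below assumes about 7.24 billion parameters for Mistral 7B (an assumption, not a figure stated in this notebook) and ignores quantization constants, activations, and any layers that stay in higher precision, so treat the numbers as approximate lower bounds.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 7.24e9  # assumption: approximate parameter count of Mistral 7B

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f" 4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the 16-bit footprint, which is what makes the model fit on a single consumer-class GPU.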

### Load pre-trained model.

In [11]:
from transformers import AutoModelForCausalLM

In [12]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config
)

config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

`low_cpu_mem_usage` was None, now set to True since model is quantized.


model.safetensors.index.json:   0%|          | 0.00/25.1k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/9.94G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/4.54G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.


### Check Mistral 7B.

In [None]:
inputs_not_chat = tokenizer.encode_plus("[INST] Tell me about fantasy football? [/INST]", return_tensors="pt")['input_ids'].to('cuda')

generated_ids = model.generate(inputs_not_chat,
                               max_new_tokens=1000,
                               do_sample=True)

decoded = tokenizer.batch_decode(generated_ids)

In [None]:
print(decoded)
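
The check above writes the `[INST] ... [/INST]` markers by hand. A small helper, sketched below, keeps that wrapping in one place; `build_instruct_prompt` is a hypothetical name introduced here, and it only covers a single user turn.

```python
def build_instruct_prompt(user_message):
    """Wrap one user message in the [INST] ... [/INST] markers used above."""
    return f"[INST] {user_message.strip()} [/INST]"


prompt_text = build_instruct_prompt("Tell me about fantasy football?")
print(prompt_text)  # [INST] Tell me about fantasy football? [/INST]
```

For multi-turn conversations, `tokenizer.apply_chat_template` in `transformers` is likely a better fit than hand-built strings, since it applies the template shipped with the model.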

### Create 'text-generation' pipeline.

In [13]:
from transformers import pipeline

In [14]:
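text_generation_pipeline = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    temperature=0.2,         # low temperature; only takes effect when sampling is enabled
    repetition_penalty=1.1,  # mildly discourage repeated tokens
    return_full_text=True,   # include the prompt in the returned text
    max_new_tokens=1000,     # upper bound on the number of generated tokens
)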

In [15]:
!pip3 install langchain

Collecting langchain
  Downloading langchain-0.1.10-py3-none-any.whl (806 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.4-py3-none-any.whl (28 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-community<0.1,>=0.0.25 (from langchain)
  Downloading langchain_community-0.0.25-py3-none-any.whl (1.8 MB)
Collecting langchain-core<0.2,>=0.1.28 (from langchain)
  Downloading langchain_core-0.1.28-py3-none-any.whl (252 kB)
Collecting langchain-text-splitters<0.1,>=0.0.1 (from langchain)
  Downlo

In [17]:
from langchain.llms import HuggingFacePipeline

In [18]:
mistral_llm = HuggingFacePipeline(pipeline=text_generation_pipeline)

In [19]:
!pip3 install playwright

Collecting playwright
  Downloading playwright-1.41.2-py3-none-manylinux1_x86_64.whl (37.4 MB)
Collecting pyee==11.0.1 (from playwright)
  Downloading pyee-11.0.1-py3-none-any.whl (15 kB)
Installing collected packages: pyee, playwright
Successfully installed playwright-1.41.2 pyee-11.0.1


In [20]:
!playwright install
!playwright install-deps

Downloading Chromium 121.0.6167.57 (playwright build v1097) from https://playwright.azureedge.net/builds/chromium/1097/chromium-linux.zip

In [22]:
from langchain.document_loaders import AsyncChromiumLoader

In [23]:
import nest_asyncio
nest_asyncio.apply()

# Articles to index
articles = ["https://www.fantasypros.com/2023/11/rival-fantasy-nfl-week-10/",
            "https://www.fantasypros.com/2023/11/5-stats-to-know-before-setting-your-fantasy-lineup-week-10/",
            "https://www.fantasypros.com/2023/11/nfl-week-10-sleeper-picks-player-predictions-2023/",
            "https://www.fantasypros.com/2023/11/nfl-dfs-week-10-stacking-advice-picks-2023-fantasy-football/",
            "https://www.fantasypros.com/2023/11/players-to-buy-low-sell-high-trade-advice-2023-fantasy-football/"]

# Scrapes the blogs above
loader = AsyncChromiumLoader(articles)
docs = loader.load()

### Convert HTML to plain text.

In [27]:
from langchain.document_transformers import Html2TextTransformer

In [28]:
html2text = Html2TextTransformer()
docs_transformed = html2text.transform_documents(docs)

ImportError: html2text package not found, please 
                install it with `pip install html2text`

In [36]:
html2text2 = Html2TextTransformer()

In [35]:
!pip3 install html2text

Collecting html2text
  Downloading html2text-2024.2.26.tar.gz (56 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: html2text
  Building wheel for html2text (setup.py) ... done
  Created wheel for html2text: filename=html2text-2024.2.26-py3-none-any.whl size=33110 sha256=fc4940e29f292108648af157cadcf28bd92ef7ac653494216e8996bc03cd46f0
  Stored in directory: /root/.cache/pip/wheels/f3/96/6d/a7eba8f80d31cbd188a2787b81514d82fc5ae6943c44777659
Successfully built html2text
Installing collected packages: html2text
Successfully installed html2text-2024.2.26


In [37]:
docs_transformed = html2text2.transform_documents(docs)

In [38]:
from langchain.text_splitter import CharacterTextSplitter

In [39]:
# Chunk text
text_splitter = CharacterTextSplitter(chunk_size=100,
                                      chunk_overlap=0)
chunked_documents = text_splitter.split_documents(docs_transformed)
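
To make the chunking step less of a black box, here is a simplified pure-Python sketch of separator-based splitting: the text is cut at a separator and neighbouring pieces are merged while they still fit in `chunk_size` characters. This is an illustration written for this notebook, not LangChain's implementation; `split_into_chunks` is a hypothetical helper, and the `"\n\n"` separator is assumed to match the splitter's default.

```python
def split_into_chunks(text, chunk_size=100, separator="\n\n"):
    """Split on `separator`, then greedily merge pieces up to `chunk_size` chars.

    A single piece longer than `chunk_size` is kept whole rather than cut.
    """
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Four synthetic paragraphs of 40, 40, 40 and 150 characters.
sample = "\n\n".join(["a" * 40, "b" * 40, "c" * 40, "d" * 150])
chunks = split_into_chunks(sample, chunk_size=100)
print([len(c) for c in chunks])  # [82, 40, 150]
```

With `chunk_size=100` the chunks are short, so each retrieved passage carries only a sentence or two of context; a larger size trades retrieval precision for more context per chunk.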



In [44]:
!pip3 install sentence_transformers

Collecting sentence_transformers
  Downloading sentence_transformers-2.5.1-py3-none-any.whl (156 kB)
Installing collected packages: sentence_transformers
Successfully installed sentence_transformers-2.5.1


In [47]:
!pip3 install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.0 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.8.0


In [48]:
from langchain.vectorstores import FAISS

In [42]:
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

In [49]:
# Load chunked documents into the FAISS index
db = FAISS.from_documents(chunked_documents,
                          HuggingFaceEmbeddings(model_name='sentence-transformers/all-mpnet-base-v2'))

retriever = db.as_retriever()
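
Conceptually, the retriever embeds the question and returns the chunks whose vectors lie closest to it. The sketch below shows that idea with brute-force L2 distance on toy 2-D vectors; it assumes L2 distance and `k=4`, which I believe are the defaults here but have not verified against this LangChain version, and `top_k_by_l2` is a hypothetical helper, not a FAISS or LangChain function.

```python
import math


def top_k_by_l2(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (smallest L2 distance)."""
    distances = [
        (math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)
    ]
    return [idx for _, idx in sorted(distances)[:k]]


# Toy 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
toy_vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
nearest = top_k_by_l2([1.0, 0.0], toy_vectors, k=2)
print(nearest)  # [1, 2]
```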

### Create PromptTemplate and LLMChain

In [50]:
from langchain.prompts import PromptTemplate

In [51]:
from langchain.chains import LLMChain

In [52]:
prompt_template = """
### [INST] Instruction: Answer the question based on your fantasy football knowledge. Here is context to help:

{context}

### QUESTION:
{question} [/INST]
 """

# Create prompt from prompt template
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)

# Create llm chain
llm_chain = LLMChain(llm=mistral_llm, prompt=prompt)
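
It can be useful to look at the exact string the model receives. The sketch below fills the same template with plain `str.format`, which mirrors how an f-string style `PromptTemplate` substitutes its variables. The two context chunks are placeholders, and `join_chunks` is a hypothetical helper, not something the chain above calls.

```python
TEMPLATE = """
### [INST] Instruction: Answer the question based on your fantasy football knowledge. Here is context to help:

{context}

### QUESTION:
{question} [/INST]
 """


def join_chunks(chunks):
    """Join retrieved text chunks into one context block, separated by blank lines."""
    return "\n\n".join(chunks)


filled = TEMPLATE.format(
    context=join_chunks(["chunk one", "chunk two"]),  # placeholder chunks
    question="Should I start Gibbs next week for fantasy?",
)
print(filled)
```

Printing the filled prompt before sending it to the model is a cheap way to catch missing variables or context that was formatted in an unexpected way.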

### Build RAG Chain

In [53]:
from langchain.schema.runnable import RunnablePassthrough

In [54]:
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | llm_chain
)
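
The dictionary at the head of the chain fans the same input out to each of its values: the retriever receives the question and returns documents, while the passthrough returns the question unchanged. A plain-Python sketch of that data flow follows; `fan_out`, `passthrough`, and `toy_retriever` are hypothetical stand-ins written for illustration, not LangChain APIs.

```python
def fan_out(steps, value):
    """Call every step with the same input and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}


def passthrough(value):
    """Return the input unchanged."""
    return value


def toy_retriever(question):
    """Stand-in for a real retriever; returns one placeholder chunk."""
    return [f"(retrieved text for: {question})"]


question = "Should I start Gibbs next week for fantasy?"
chain_inputs = fan_out({"context": toy_retriever, "question": passthrough}, question)
print(chain_inputs["question"])
print(chain_inputs["context"])
```

The resulting dictionary has exactly the two keys the prompt template expects, which is why it can be piped straight into `llm_chain`.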

In [55]:
result = rag_chain.invoke("Should I start Gibbs next week for fantasy?")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [56]:
result['context']

[Document(page_content='This week, Harris faces the bottom-of-the-barrel Packers’ run defense that\nallows the ninth-most fantasy points per game to the running back position.\nHarris will give you a higher-volume RB with a low rostership percentage this\nweek.', metadata={'source': 'https://www.fantasypros.com/2023/11/nfl-dfs-week-10-stacking-advice-picks-2023-fantasy-football/'}),
 Document(page_content='could start cutting into his workload. Furthermore, his rest of the season\nschedule isn’t fantasy-friendly. Try to flip Edwards and a WR3 for Kenneth\nWalker or Tony Pollard', metadata={'source': 'https://www.fantasypros.com/2023/11/players-to-buy-low-sell-high-trade-advice-2023-fantasy-football/'}),
 Document(page_content='“**Gus Edwards** has been on fire lately. He is the RB1 over the past three\nweeks, averaging 22.2 half-point PPR fantasy points and two rushing touchdowns\nper game. However, over 54% of his fantasy production came from the six\nrushing touchdowns. Meanwhile, th

In [57]:
print(result['text'])


Based on the information provided, it seems like there are some other options available for your fantasy team that may provide better value than starting Gibbs next week. 

Firstly, Gus Edwards has been performing well recently and could be a good option to consider. He has been the RB1 over the past three weeks and has averaged 22.2 half-point PPR fantasy points per game. Additionally, he has scored two rushing touchdowns per game. However, it's worth noting that over 54% of his fantasy production comes from the six rushing touchdowns he scored. 

Another option to consider is Keaton Mitchell, who has been performing well so far this season. He has averaged 7.8 fantasy points per game and has scored a touchdown in each of the last three games. 

Finally, if you're looking for a high-volume RB with a low rostership percentage, you might want to consider Kenneth Walker or Tony Pollard. Both players have been performing well this season and could provide some upside for your team. 

Ult