<a href="https://colab.research.google.com/github/marcelotournier/llm_notebook_playground/blob/main/Mistral7_pubmed_summarizer_gguf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Summarizing PubMed News about Type 2 Diabetes

This notebook contains a demo of an LLM summarizing the newest papers on Type 2 Diabetes from a PubMed RSS feed.

### Config:
- Select the menu "Runtime", then "Change runtime type"
- In the option "Hardware accelerator" choose "T4 GPU"

### Using:
- `generate(prompt)` returns the whole response at once (can take a while)
- `stream(prompt)` prints the response one token at a time, as in ChatGPT

In [1]:
# Install deps
!pip install ctransformers[cuda] --quiet

In [2]:
# Config params - do not change these unless you know what you are doing.

MODEL = "TheBloke/SciPhi-Mistral-7B-32k-GGUF"
GGML = 'sciphi-mistral-7b-32k.Q8_0.gguf'
GPU_LAYERS = 99999
SYS_PROMPT = """SYSTEM: You are a helpful, respectful and honest assistant.
Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
Summarize the findings of this XML string containing scientific articles into a blog post article. Include the references whenever convenient.
USER: {prompt}
ASSISTANT:"""
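
The `{prompt}` placeholder in `SYS_PROMPT` is filled with `str.format`. As a minimal sketch, the snippet below uses a shortened stand-in template (`TEMPLATE` and `build_prompt` are illustrative names, not part of the notebook) to show how the final prompt is assembled. It also shows that curly braces inside the inserted text are left untouched, because `str.format` only parses the template itself.

```python
# Shortened stand-in for SYS_PROMPT, used here for illustration only.
TEMPLATE = """SYSTEM: Summarize the findings of this XML string.
USER: {prompt}
ASSISTANT:"""


def build_prompt(user_text: str) -> str:
    """Fill the {prompt} placeholder with the user's text."""
    return TEMPLATE.format(prompt=user_text)


# Braces inside the inserted text are not interpreted as placeholders.
example = build_prompt("<item><title>Sample {title}</title></item>")
print(example)
```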

In [3]:
# Constructing model - do not touch here unless you know what you are doing.
from ctransformers import AutoModelForCausalLM


llm = AutoModelForCausalLM.from_pretrained(
      MODEL,
      model_file=GGML,
      model_type="mistral",
      context_length = 32000,
      gpu_layers=GPU_LAYERS,
      stream=True,
      max_new_tokens=1024,
      temperature=0.2,
      repetition_penalty=1.3,
      )


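def generate(prompt):
    """Return the complete model response for `prompt` as a single string."""
    return llm(SYS_PROMPT.format(prompt=prompt),
               stream=False)


def stream(prompt):
    """Print the model response for `prompt` token by token as it is generated."""
    for token in llm(SYS_PROMPT.format(prompt=prompt)):
        print(token, end='', flush=True)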

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Fetching 1 files:   0%|          | 0/1 [00:00<?, ?it/s]

In [4]:
# Testing the model
llm("tell me about diabetes", stream=False)

'\n\n# Type 1 diabetes: A chronic condition that requires insulin injections or infusions to manage blood sugar levels. It is typically diagnosed in children and young adults, but can occur at any age. Symptoms may include frequent urination, excessive thirst, extreme hunger, fatigue, weight loss despite increased appetite, and ketones in the urine.\n\nType 2 diabetes: A chronic condition that affects how your body uses insulin to manage blood sugar levels. It is typically diagnosed in adults over age 40 but can occur at any age. Symptoms may include frequent urination, excessive thirst, extreme hunger, fatigue, weight loss despite increased appetite, and ketones in the urine.\n\nBoth types of diabetes require careful monitoring of blood sugar levels through regular testing and adjustments to insulin dosages or other medications as needed. Proper diet, exercise, and lifestyle changes are also essential for managing diabetes effectively.'

# Usage examples:

You can create your own PubMed RSS feed and replace the value of the variable `pubmed_rss` below with your link. For better results, restrict the feed to the 5 most recent articles; note that the example link below is set to `limit=10`.

In [5]:
import requests

pubmed_rss = "https://pubmed.ncbi.nlm.nih.gov/rss/search/1PQPGz2gzLgGzu9Ukv6Gc6AJfbcOKwnGqdz9iHxCjqhaIYdM-F/?limit=10&utm_campaign=pubmed-2&fc=20240306113136"

In [6]:
# Download the RSS feed as raw XML text (the timeout avoids hanging forever)
text = requests.get(pubmed_rss, timeout=30).text
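
Before sending `text` to the model, it may help to check that the prompt fits in the `context_length=32000` window configured above while leaving room for the `max_new_tokens=1024` reply. The following is a rough sketch: `fits_in_context` and `whitespace_tokenize` are hypothetical helpers, and the whitespace tokenizer is only a stand-in so the example runs without the model.

```python
from typing import Callable, Sequence

CONTEXT_LENGTH = 32000  # same value passed to from_pretrained above
MAX_NEW_TOKENS = 1024   # same value passed to from_pretrained above


def fits_in_context(
    prompt: str,
    tokenize: Callable[[str], Sequence],
    context_length: int = CONTEXT_LENGTH,
    max_new_tokens: int = MAX_NEW_TOKENS,
) -> bool:
    """Return True if the prompt plus the reply budget fits in the context window."""
    prompt_tokens = len(tokenize(prompt))
    return prompt_tokens + max_new_tokens <= context_length


def whitespace_tokenize(text: str) -> list:
    """Stand-in tokenizer (assumption): splits on whitespace, not the model's real tokenizer."""
    return text.split()


print(fits_in_context("a short prompt", whitespace_tokenize))
```

In the notebook, the stand-in could be swapped for the model's own tokenizer, for example `fits_in_context(SYS_PROMPT.format(prompt=text), llm.tokenize)`, assuming the loaded `llm` object exposes a `tokenize` method as described in the ctransformers documentation.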

In [None]:
# This takes a long time, probably because the raw RSS XML makes for a very large prompt
stream(text)

In [None]:
# This also takes a long time, probably because the raw RSS XML makes for a very large prompt
generate(text)
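
If the slow runs are indeed caused by the size of the raw XML, one option is to strip the feed down to the fields the summary needs before prompting. The sketch below is an illustration under assumptions: `condense_rss` and `SAMPLE_RSS` are hypothetical, and the code assumes the usual RSS 2.0 layout of `<item>` elements with `<title>`, `<description>` and `<link>` children. The real PubMed feed may carry additional fields or markup inside the description.

```python
import xml.etree.ElementTree as ET

# Tiny made-up feed so the example runs offline; not real PubMed data.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>First paper</title><description>Abstract one.</description><link>https://example.org/1</link></item>
    <item><title>Second paper</title><description>Abstract two.</description><link>https://example.org/2</link></item>
    <item><title>Third paper</title><description>Abstract three.</description><link>https://example.org/3</link></item>
  </channel>
</rss>"""


def condense_rss(xml_text: str, max_items: int = 5) -> str:
    """Keep only title, description and link of the first `max_items` items."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        if len(entries) >= max_items:
            break
        # findtext returns None when a child element is missing
        title = (item.findtext("title") or "").strip()
        summary = (item.findtext("description") or "").strip()
        link = (item.findtext("link") or "").strip()
        entries.append(f"Title: {title}\nSummary: {summary}\nLink: {link}")
    return "\n\n".join(entries)


condensed = condense_rss(SAMPLE_RSS, max_items=2)
print(condensed)
```

With the real feed, `stream(condense_rss(text))` would send a much shorter prompt than `stream(text)`; whether the summaries stay as good is something to check by hand.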