<a href="https://colab.research.google.com/github/marcelotournier/llm_notebook_playground/blob/main/Mistral7_pubmed_summarizer_gguf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Summarizing PubMed Automated Article RSS Feeds about Type 2 Diabetes

This notebook demonstrates an LLM summarizing the newest papers on Type 2 Diabetes from a PubMed RSS feed.

You can create your own RSS feeds and test with this code, using the guide at https://www.nlm.nih.gov/pubs/techbull/mj05/mj05_rss.html

We recommend feeds with at most 5 articles, because the model's token context length and the available GPU power limit how much text can be summarized in one pass.
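
The context limit can be checked before running the model. The following is a minimal sketch, assuming the budget is the context window (`n_ctx=32000` in the model cell of this notebook) minus the room reserved for the answer (`MAX_TOKENS_IN_RESPONSE = 2048`). The names `fits_in_context` and `whitespace_tokenize` are hypothetical, and the whitespace tokenizer is only a stand-in so the snippet runs without a model; real token counts will be higher.

```python
from typing import Callable, Sequence


def fits_in_context(
    text: str,
    tokenize: Callable[[str], Sequence],
    n_ctx: int = 32000,
    max_new_tokens: int = 2048,
) -> bool:
    """Return True if the prompt tokens plus the reserved answer tokens fit in n_ctx."""
    return len(tokenize(text)) + max_new_tokens <= n_ctx


def whitespace_tokenize(text: str) -> list:
    """Stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return text.split()


print(fits_in_context("a short feed with few words", whitespace_tokenize))
```

Once the model is loaded with llama-cpp-python, `lambda s: llm.tokenize(s.encode("utf-8"))` could be passed as `tokenize`. The text to check should be the full prompt, including the system prompt, not just the feed.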

### Config:
- Select the menu "Runtime", then "Change runtime type"
- In the option "Hardware accelerator" choose "T4 GPU"
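
After switching the runtime, it can help to confirm that a GPU is actually visible. The sketch below assumes the NVIDIA driver tool `nvidia-smi` is on the `PATH` when a GPU runtime is active; `gpu_visible` is a hypothetical helper name, not part of this notebook.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and bool(result.stdout.strip())


print("GPU visible:", gpu_visible())
```

If this prints `False`, the model can still run on the CPU, but much more slowly.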

### Using:
- `generate(prompt)` will give you the whole response at once (can take a while)
- `stream(prompt)` will print one token at a time, as in ChatGPT

In [1]:
# Install the llama-cpp-python bindings, built with CUDA (cuBLAS) GPU support
# Alternative CMAKE_ARGS for an OpenBLAS build: "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
!CMAKE_ARGS="-DLLAMA_CUBLAS=ON" \
  pip install llama-cpp-python #--quiet

Collecting llama-cpp-python
  Using cached llama_cpp_python-0.2.56.tar.gz (36.9 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.2.56-cp310-cp310-manylinux_2_35_x86_64.whl size=26411249 sha256=6cae44e42b195ab34f4ff6467b8e595447cc530fca9c0b8577e3539d00930b33
  Stored in directory: /home/mt/.cache/pip/wheels/e5/09/9d/c413053f6258cb2546cc792418c595e276f9efd5db31a80377
Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.3 llama-cpp-python

In [110]:
# Config params - change these only if you know what you are doing.

INPUT_MODEL_PATH = "https://huggingface.co/TheBloke/SciPhi-Mistral-7B-32k-GGUF/resolve/main/sciphi-mistral-7b-32k.Q4_K_M.gguf"
OUTPUT_MODEL_PATH = "downloaded_model.gguf"
GPU_LAYERS = 999
MAX_TOKENS_IN_RESPONSE = 2048
SYS_PROMPT = """You are a helpful, respectful and honest assistant.
Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
The task is to process a text with many articles and summarize the findings of all articles into a single text.
Rewrite the text in a blog post that is easy to read.
ALWAYS, WITHOUT ANY EXCEPTION, FINISH YOUR ANSWER WITH THE TAG [END].
Here is an example:
### INPUT:
Study Bork
Joe Doe
Jane Dane
This study found that apples are good
Study Xomp
Mac Owey
Derek Rex
Oranges are better than apples
### RESPONSE:
The study Bork from Joe Doe and Jane Dane found that apples are good, although it seems that oranges are better than apples, according to Researchers Owey and Rex.
[END]
Now do the same for the following input.
### INPUT:
{prompt}
### RESPONSE:
"""
USE_CACHE = True

In [2]:
# Download the model, unless a cached copy is already on disk
import os

if not USE_CACHE or not os.path.exists(OUTPUT_MODEL_PATH):
    os.system(f'curl -L "{INPUT_MODEL_PATH}" -o "{OUTPUT_MODEL_PATH}"')

In [3]:
# Constructing the model - change this only if you know what you are doing.
from llama_cpp import Llama


llm = Llama(
      model_path=OUTPUT_MODEL_PATH,
      n_gpu_layers=GPU_LAYERS, # Offload all layers to the GPU; set to 0 to run on CPU only
      # seed=1337, # Uncomment to set a specific seed
      n_ctx=32000, # Context window in tokens (the model is a 32k variant)
      verbose=False,
)

ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes


In [41]:
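# Helper functions to generate outputs
def generate(prompt, stream=False):
    """Run the model on `prompt` wrapped in SYS_PROMPT.

    Returns the full response text, or an iterator of chunks if stream=True.
    """
    response = llm(
        SYS_PROMPT.format(prompt=prompt),
        stream=stream,
        max_tokens=MAX_TOKENS_IN_RESPONSE,
        stop=["[END]"],  # SYS_PROMPT asks the model to finish with this tag
    )
    if stream:
        return response
    return response['choices'][0]['text']


def stream(prompt):
    """Print the response chunk by chunk as it is generated."""
    for token in generate(prompt, stream=True):
        print(token['choices'][0]['text'], end='', flush=True)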
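
`stream(prompt)` prints the text but does not keep it. A minimal sketch of collecting the streamed pieces into one string follows, assuming each chunk has the `['choices'][0]['text']` shape used in the helpers above. `iter_text`, `collect` and `fake_chunks` are hypothetical names, and the fake chunks stand in for `generate(prompt, stream=True)` so the snippet runs without a model.

```python
from typing import Dict, Iterable, Iterator


def iter_text(chunks: Iterable[Dict]) -> Iterator[str]:
    """Yield the text fragment carried by each streamed chunk."""
    for chunk in chunks:
        yield chunk["choices"][0]["text"]


def collect(chunks: Iterable[Dict]) -> str:
    """Join all streamed fragments into the full response text."""
    return "".join(iter_text(chunks))


# Stand-in for generate(prompt, stream=True); the real chunks come from the model.
fake_chunks = [{"choices": [{"text": t}]} for t in ["Type 2 ", "diabetes ", "is common."]]
summary = collect(fake_chunks)
print(summary)
```

With the model loaded, `collect(generate(prompt, stream=True))` would return the same text that `stream(prompt)` prints.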

In [42]:
# Testing the model
print(generate("tell me about diabetes"))

Diabetes is a group of metabolic diseases characterized by high blood sugar levels over an extended period of time. There are two major types of diabetes: Type 1 diabetes (insulin-dependent diabetes) and Type 2 diabetes (non-insulin-dependent diabetes). The most common form is Type 2 diabetes, which is also known as adult-onset diabetes due to its prevalence in adults. However, more children and adolescents are developing Type 2 diabetes than ever before [1].

Type 1 diabetes occurs when the pancreas does not produce enough insulin or produces no insulin at all. Insulin is an essential hormone that regulates blood sugar levels. In contrast, Type 2 diabetes occurs when the body does not use insulin properly (insulin resistance), leading to chronically high blood sugar levels [2].

Diabetes can cause various complications if not managed effectively. Some of the most severe complications include kidney damage (nephropathy), nerve damage (neuropathy), heart disease, stroke, and blindness (

In [92]:
stream("tell me about diabetes")

OUTPUT: Diabetes is a group of metabolic diseases characterized by high blood sugar levels over a prolonged period. Type 2 diabetes is the most common form of diabetes, accounting for approximately 90% of all diabetes cases worldwide. It is primarily caused by insulin resistance and impaired insulin secretion in the pancreas. Symptoms of type 2 diabetes include frequent urination, increased thirst, fatigue, and blurred vision. Early diagnosis and treatment are crucial to prevent complications such as heart disease, stroke, kidney disease, and neuropathy. Lifestyle changes, including diet modification, regular exercise, and maintaining a healthy weight, can help manage type 2 diabetes. However, in some cases, medication may be necessary to achieve blood sugar control. 

# Usage examples:

You can create your own PubMed RSS feed and set the variable `pubmed_rss` below to its link. For better results, restrict the feed to the 5 most recent articles, for example by lowering the `limit` query parameter in the feed URL.

In [52]:
import re
import requests


pubmed_rss = "https://pubmed.ncbi.nlm.nih.gov/rss/search/1PQPGz2gzLgGzu9Ukv6Gc6AJfbcOKwnGqdz9iHxCjqhaIYdM-F/?limit=10&utm_campaign=pubmed-2&fc=20240306113136"

In [107]:
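# Fetch the feed, then strip tags, double-space indentation, and doubled newlines
response = requests.get(pubmed_rss, timeout=30)
response.raise_for_status()
text = re.sub(r'<[^>]+>', '', response.text).replace("  ", "").replace("\n\n", "\n")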
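
Stripping tags with a regular expression keeps the feed metadata and leaves entities such as `&amp;` in the text. As an alternative, here is a hedged sketch that parses the feed as XML and keeps only the title and description of the first few items. It assumes a plain RSS 2.0 layout (`channel/item/title` and `channel/item/description`); `feed_to_text` and `SAMPLE_FEED` are hypothetical, and the sample feed is made up for illustration, not taken from PubMed.

```python
import html
import re
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample so the snippet runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>type 2 diabetes</title>
    <item>
      <title>Study A &amp; follow-up</title>
      <description>&lt;p&gt;Exercise helped.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Study B</title>
      <description>No abstract</description>
    </item>
  </channel>
</rss>"""


def feed_to_text(feed_xml: str, max_items: int = 5) -> str:
    """Return 'title + description' blocks for the first max_items feed items."""
    root = ET.fromstring(feed_xml)
    blocks = []
    for item in root.findall("./channel/item")[:max_items]:
        title = (item.findtext("title") or "").strip()
        body = item.findtext("description") or ""
        # Descriptions may carry embedded HTML; drop tags, then decode entities.
        body = html.unescape(re.sub(r"<[^>]+>", "", body)).strip()
        blocks.append(f"{title}\n{body}")
    return "\n\n".join(blocks)


print(feed_to_text(SAMPLE_FEED))
```

With a real feed, `feed_to_text(requests.get(pubmed_rss).text)` could replace the regex-based cleanup, and `max_items=5` enforces the recommended article limit regardless of how the feed is configured.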

In [111]:
# This takes a long time, probably because the input text is large
stream(text)


1. Introduction

Type 2 diabetes mellitus (T2DM) is a chronic metabolic disorder characterized by high blood sugar levels in the absence of insulin or insulin resistance [1]. The prevalence of T2DM has been increasing globally, with an estimated 10.5% of the world's population suffering from this disease by 2021 [2]. The progression of T2DM can be influenced by various factors, including genetic predisposition, lifestyle choices, and underlying health conditions [3]. One such factor is impaired ketone production, which can contribute to the development of T2DM in individuals with multiple acyl-CoA dehydrogenase deficiency (MADD). MADD is an inherited metabolic disorder caused by biallelic pathogenic variants in genes related to the flavoprotein complex [4]. The dysfunction of the complex leads to impaired fatty acid oxidation and ketone body production, which can cause hypoketotic hypoglycemia with prolonged fasting [5]. 


2. Background

MADD is caused by biallelic pathogenic variant

In [113]:
print(text)



type 2 diabetes
https://pubmed.ncbi.nlm.nih.gov/rss-feed/?feed_id=1PQPGz2gzLgGzu9Ukv6Gc6AJfbcOKwnGqdz9iHxCjqhaIYdM-F&amp;ff=20240312113414&amp;v=2.18.0.post9+e462414&amp;utm_campaign=pubmed-2&amp;utm_content=1PQPGz2gzLgGzu9Ukv6Gc6AJfbcOKwnGqdz9iHxCjqhaIYdM-F&amp;fc=20240306113136&amp;utm_medium=rss&amp;utm_source=Python Requests
type 2 diabetes: Latest results from PubMed
http://www.rssboard.org/rss-specification
PubMed RSS feeds (2.18.0.post9+e462414)
en
Tue, 12 Mar 2024 15:34:15 +0000
Tue, 12 Mar 2024 06:00:00 -0400
120
Variable effect of exercise training program on cardiovascular outcomes in high genetic risk individuals with diabetes
https://pubmed.ncbi.nlm.nih.gov/38469747/?utm_source=Python Requests&amp;utm_medium=rss&amp;utm_campaign=pubmed-2&amp;utm_content=1PQPGz2gzLgGzu9Ukv6Gc6AJfbcOKwnGqdz9iHxCjqhaIYdM-F&amp;fc=20240306113136&amp;ff=20240312113414&amp;v=2.18.0.post9+e462414
No abstract
Eur J Prev Cardiol. 2024 Mar 12:zwae106. doi: 10.1093/eurjpc/zwae106. Online ahead of p

In [None]:
# This takes a long time, probably because the input text is large
generate(text)