Choosing a large language model

In [1]:
from llamba.chatmodels.ollama import OllamaModel
chatbot = OllamaModel(url="http://127.0.0.1:11434/", 
                      endpoint="api/generate", 
                      model="llama3", 
                      num_threads=1, 
                      check_connection_timeout=15, 
                      request_timeout=15)  # optional; the model parameter may differ, see the models supported by Ollama
connection = chatbot.check_connection()
print(connection)

True
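
For orientation, the `url`, `endpoint`, and `model` arguments above correspond to the pieces of a request to Ollama's HTTP generate endpoint. The sketch below only assembles the URL and JSON body and sends nothing; `build_generate_request` is a hypothetical helper, and how `OllamaModel` builds its requests internally is an assumption, not something this notebook shows.

```python
import json
from urllib.parse import urljoin


def build_generate_request(base_url, endpoint, model, prompt):
    """Return (url, JSON body) for a non-streaming Ollama generate call.

    Hypothetical helper for illustration; not part of llamba.
    """
    url = urljoin(base_url, endpoint)
    # "stream": False asks Ollama for a single JSON response instead of chunks.
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return url, body


url, body = build_generate_request(
    "http://127.0.0.1:11434/", "api/generate", "llama3", "What is PDGFB?"
)
print(url)
print(body)
```

The body could then be sent with any HTTP client as a POST request, using a timeout comparable to `request_timeout` above.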


Data preparation

In [2]:
import torch
from torch import nn
import pandas as pd
import numpy as np

from llamba_library.bioage_model import BioAgeModel
from llamba_library.functions import get_shap_dict
from txai_omics_3.models.tabular.widedeep.ft_transformer import WDFTTransformerModel, FN_SHAP, FN_CHECKPOINT


#### Data

my_data = {'CXCL9': 2599.629474, 
           'CCL22': 820.306524, 
           'IL6': 0.846377, 
           'PDGFB': 13400.666359, 
           'CD40LG': 1853.847406, 
           'IL27': 1128.886982,
           'VEGFA': 153.574220,
           'CSF1': 239.627236,
           'PDGFA': 1005.844290,
           'CXCL10': 228.229829,
           'Age': 90.454972 }

data = pd.DataFrame(my_data, index=[0])


#### Model
    
fn_model = FN_CHECKPOINT
model = WDFTTransformerModel.load_from_checkpoint(fn_model)
bioage_model = BioAgeModel(model=model)

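def predict_func(x):
    # Wrap a NumPy feature matrix into the batch dict passed to the model.
    batch = {
        'all': torch.from_numpy(np.float32(x)),
        'continuous': torch.from_numpy(np.float32(x)),
        # No categorical features: selecting zero columns yields an empty (n, 0) tensor.
        'categorical': torch.from_numpy(np.int32(x[:, []])),
    }
    return model(batch).cpu().detach().numpy()

shap_dict = get_shap_dict(FN_SHAP)
explainer = shap_dict['explainer']
# Feature names: every column except the chronological age.
feats = data.drop(['Age'], axis=1).columns.to_list()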

top_n = 3  # number of features with the largest contribution

top_shap = bioage_model.get_top_shap(top_n, data, feats, shap_dict) 

h:\Lobachevsky\llamba\llamba_env\Lib\site-packages\pytorch_lightning\utilities\migration\utils.py:55: The loaded checkpoint was produced with Lightning v2.4.0, which is newer than your current Lightning version: v2.1.4
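
The internals of `get_top_shap` are not shown in this notebook. One plausible reading of "features with the largest contribution" is a ranking by absolute SHAP value, which can be sketched as follows. The function name `top_n_by_abs` is hypothetical and the SHAP values are made up for illustration; they are not results from the model above.

```python
def top_n_by_abs(feats, shap_values, n):
    """Return the n (feature, SHAP value) pairs with the largest |SHAP value|."""
    ranked = sorted(zip(feats, shap_values), key=lambda fv: abs(fv[1]), reverse=True)
    return ranked[:n]


# Made-up SHAP values, for illustration only.
example_feats = ["CXCL9", "CCL22", "IL6", "PDGFB", "CD40LG"]
example_shap = [1.2, -0.3, 0.1, 4.0, -2.5]

top = top_n_by_abs(example_feats, example_shap, 3)
print(top)
```

Ranking by absolute value keeps features that pull the prediction down as well as those that push it up; whether llamba does the same is an assumption.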


Composing the query for the expert system

In [3]:
from llamba.connector import LlambaConnector

connector = LlambaConnector(bioage_model=bioage_model, chat_model=chatbot)
prompts = connector.generate_prompts(top_n=top_n, 
                                     data=top_shap['data'], 
                                     feats=top_shap['feats'], 
                                     values=top_shap['values'])  # top_n is the number of features that contributed the most
print("Prompts: ")
for prompt in prompts:
    print(prompt)

Prompts: 
What is PDGFB? What does an increased level of PDGFB mean?
What is CD40LG? What does an increased level of CD40LG mean?
What is CXCL9? What does an increased level of CXCL9 mean?
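
The prompts printed above follow a fixed template per feature. A minimal sketch of such a template is given below, assuming the wording ("increased" or "decreased") is chosen from the sign of the feature's SHAP value. That rule and the helper `make_prompt` are assumptions for illustration; the output above only shows the "increased" case, and `generate_prompts` may decide differently.

```python
def make_prompt(feature, shap_value):
    """Build a two-question prompt for one feature (hypothetical helper)."""
    # Assumption: a positive SHAP value is phrased as an increased level.
    direction = "an increased" if shap_value > 0 else "a decreased"
    return f"What is {feature}? What does {direction} level of {feature} mean?"


# Made-up SHAP values, for illustration only.
for name, value in [("PDGFB", 4.0), ("CD40LG", 2.5), ("IL6", -0.4)]:
    print(make_prompt(name, value))
```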


Passing the data to the expert system

In [4]:
res = connector.query_prompts()
print("Analysis: \n")
print(res)

Analysis: 

PDGFB: 13400.666359
CD40LG: 1853.847406
CXCL9: 2599.629474
PDGFB stands for Platelet-derived growth factor beta. It is a type of protein that plays a role in cell signaling and is involved in various cellular processes, including proliferation, differentiation, and survival.

An increased level of PDGFB has been associated with several age-related diseases and conditions, including atherosclerosis, hypertension, and cardiovascular disease. Elevated levels of PDGFB may also be indicative of certain types of cancer, such as melanoma and glioblastoma.

In the context of aging, increased levels of PDGFB have been linked to age-related changes in the vasculature, including vessel wall thickening and stiffening. This can contribute to reduced blood flow and oxygen delivery to tissues, which may exacerbate age-related declines in physical function and cognitive impairment.

CD40LG (also known as TNFSF5) is a type I transmembrane glycoprotein that belongs to the tumor necrosis fact