In [35]:
import time
import pandas as pd
import numpy as np
from IPython.display import display, HTML
from pymilvus import (
    connections,
    utility,
    FieldSchema, CollectionSchema, DataType,
    Collection,
)

fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 3000, 8

#################################################################################
# 1. Connect to Milvus
# Register a connection under the alias `default` for the Milvus server at `localhost:19530`.
# The "default" alias is built into PyMilvus, so if Milvus is reachable at
# `localhost:19530` you can omit all parameters and call `connections.connect()`.
#
# Note: the `using` parameter of the following methods defaults to "default".
print(fmt.format("start connecting to Milvus"))
connections.connect("default", host="localhost", port="19530")


# Open the existing collection named "test" and load it into memory so it can be searched
test = Collection("test", consistency_level="Strong")
test.load()


=== start connecting to Milvus     ===



In [5]:
import torch
from transformers import AutoTokenizer, AutoModel

# Load pre-trained BERT model and tokenizer
bert_model_name = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(bert_model_name)
bert_model = AutoModel.from_pretrained(bert_model_name)

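def embed_text_with_bert(text, tokenizer=tokenizer, bert_model=bert_model):
    """Return a mean-pooled BERT embedding of `text` as a list of floats."""
    # Tokenize; inputs longer than the model's maximum length are truncated
    tokenized_text = tokenizer(text, return_tensors='pt', truncation=True, padding=True)
    with torch.no_grad():
        model_output = bert_model(**tokenized_text)
        # Average the token vectors of the last hidden layer into a single vector
        embedding = model_output.last_hidden_state.mean(dim=1)

    return embedding.squeeze().tolist()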

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
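
The `mean(dim=1)` call in `embed_text_with_bert` averages over every token position. For a single input that is fine, but if several texts were embedded together in one padded batch, the padding positions would be averaged in as well. Below is a minimal sketch of mask-weighted mean pooling, written with plain Python lists to show only the arithmetic. `masked_mean_pool` is a hypothetical helper name, not part of this notebook or of the transformers library, and the numbers are toy values.

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average only the token vectors whose mask entry is 1.

    token_vectors: list of per-token vectors (lists of floats) for one text.
    attention_mask: list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Two real tokens followed by one padding position (toy numbers).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(masked_mean_pool(tokens, mask))  # [2.0, 3.0]
```

With tensors, the same idea is usually expressed by multiplying the hidden states by the tokenizer's `attention_mask` and dividing by the number of unmasked positions.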


In [27]:
def retrieve_relevant_chunks(query, limit=10, collection=test, tokenizer=tokenizer, bert_model=bert_model):
    # Embed the query text with the tokenizer and model that were passed in
    vector_to_search = embed_text_with_bert(query, tokenizer=tokenizer, bert_model=bert_model)
    search_params = {
        "metric_type": "COSINE"
    }
    # Search the collection that was passed in and fetch the title and chunk of each hit
    result = collection.search([vector_to_search], "chunk_embedded", search_params, limit=limit, output_fields=["title", "chunk"])
    # Collect hits as a list of rows (DataFrame.append was removed in pandas 2.0)
    rows = []
    for hits in result:
        for hit in hits:
            rows.append({'id': hit.id, 'metric': hit.distance, 'title': hit.entity.get('title'), 'chunk': hit.entity.get('chunk')})
    return pd.DataFrame(rows, columns=['id', 'metric', 'title', 'chunk'])
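
With `metric_type` set to `COSINE`, the value that ends up in the `metric` column is a cosine similarity, so larger values indicate closer matches. Below is a minimal sketch of that computation in plain Python. `cosine_similarity` is a hypothetical helper for illustration, not a PyMilvus function, and Milvus computes the score server-side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Because the score depends only on direction, rescaling an embedding does not change its ranking under this metric.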


In [37]:
result = retrieve_relevant_chunks('How does emotional intelligence affect student lives?', limit=5)
display(HTML(result.to_html()))

Unnamed: 0,id,metric,title,chunk
0,379736804,0.745491,Students' Foreign Language Learning Adaptability and Mental Health Supported by Artificial Intelligence.,"Psychological health problems include learning anxiety, loneliness, depression, and inferiority in college students' foreign language learning. These negative emotions, to a certain extent, affect the learning effect of college students' foreign language learning. This paper is of great significance to the adaptability of college students' foreign language learning to the intelligent environment and the analysis of their mental health problems"
1,379829576,0.721002,Trait emotional intelligence and resilience: gender differences among university students.,"For prevention of mental disorders and to foster wellbeing, it might be helpful to focus on improvement of self-perception in girls and women, and on supporting emotional awareness towards other people's emotions in boys and men. Further studies in the field should address other populations"
2,379741086,0.720428,"The influence of emotional intelligence on academic stress among medical students in Neyshabur, Iran.","Intervention based on emotional intelligence significantly (p < 0. 05) improved students' emotional intelligence skills and decreased their academic stress and reactions to stressors in the intervention group. CONCLUSION: It appears that emotional intelligence training is a feasible and highly acceptable way to develop coping skills with academic stress; therefore, such training is essential to be considered as part of university education to improve students' education quality and their skills to study without academic stress"
3,379736805,0.717203,Students' Foreign Language Learning Adaptability and Mental Health Supported by Artificial Intelligence.,This paper hopes to provide data reference for the research on improving college students' foreign language learning effects
4,379829571,0.715925,Trait emotional intelligence and resilience: gender differences among university students.,"BACKGROUND: Previous studies have reported strong correlations of emotional intelligence (EI) with mental health and wellbeing; it is also a powerful predictor of social functioning and personal adaption. Resilience is the ability to adapt to significant life stressors and is also crucial for maintaining and restoring physical and mental health. The aim of this study was to investigate EI and resilience in healthy university students, with a focus on gender differences in EI and resilience components"
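
Several rows in the output above come from the same paper. If one row per paper is preferred, the hits can be reduced to the best-scoring chunk per title. Below is a minimal sketch on plain dicts. `best_chunk_per_title` is a hypothetical helper, and the rows reuse the ids, scores and titles from the output above with the chunk text omitted.

```python
def best_chunk_per_title(rows):
    """Keep the highest-scoring row for each title, best first."""
    best = {}
    for row in rows:
        current = best.get(row["title"])
        if current is None or row["metric"] > current["metric"]:
            best[row["title"]] = row
    return sorted(best.values(), key=lambda r: r["metric"], reverse=True)


foreign_language = "Students' Foreign Language Learning Adaptability and Mental Health Supported by Artificial Intelligence."
trait_ei = "Trait emotional intelligence and resilience: gender differences among university students."
academic_stress = "The influence of emotional intelligence on academic stress among medical students in Neyshabur, Iran."

rows = [
    {"id": 379736804, "metric": 0.745491, "title": foreign_language},
    {"id": 379829576, "metric": 0.721002, "title": trait_ei},
    {"id": 379741086, "metric": 0.720428, "title": academic_stress},
    {"id": 379736805, "metric": 0.717203, "title": foreign_language},
    {"id": 379829571, "metric": 0.715925, "title": trait_ei},
]

print([row["id"] for row in best_chunk_per_title(rows)])
# [379736804, 379829576, 379741086]
```

On the DataFrame returned by `retrieve_relevant_chunks`, the same reduction could likely be written as `result.sort_values('metric', ascending=False).drop_duplicates('title')`.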


In [38]:
result = retrieve_relevant_chunks('What is used in brain cancer imaging?', limit=5)
display(HTML(result.to_html()))

Unnamed: 0,id,metric,title,chunk
0,379751973,0.762949,Covalent drugs based on small molecules and peptides for disease theranostics.,"Finally, we discuss the application of covalent peptide drugs and expect to provide a new reference for cancer treatment"
1,379770544,0.761487,Research related to the diagnosis of prostate cancer based on machine learning medical images: A review.,"DISCUSSION: Machine learning and deep learning combined with medical imaging have a broad application prospect for the diagnosis and staging of prostate cancer, but the research in this area still has more room for development"
2,379760861,0.745016,Artificial Intelligence-Based Methods for Integrating Local and Global Features for Brain Cancer Imaging: Scoping Review.,BACKGROUND: Transformer-based models are gaining popularity in medical imaging and cancer imaging applications. Many recent studies have demonstrated the use of transformer-based models for brain cancer imaging applications such as diagnosis and tumor segmentation. OBJECTIVE: This study aims to review how different vision transformers (ViTs) contributed to advancing brain cancer diagnosis and tumor segmentation using brain image data
3,379825021,0.730981,Artificial Intelligence in Melanoma Dermatopathology: A Review of Literature.,"Pathology serves as a promising field to integrate artificial intelligence into clinical practice as a powerful screening tool. Melanoma is a common skin cancer with high mortality and morbidity, requiring timely and accurate histopathologic diagnosis. This review explores applications of artificial intelligence in melanoma dermatopathology, including differential diagnostics, prognosis prediction, and personalized medicine decision-making"
4,379775502,0.727216,Detection of large-droplet macrovesicular steatosis in donor Livers based on segment-anything model.,"Artificial intelligence (AI) models, such as segmentation and detection models, are being developed to detect LDF cells. The Segment-Anything Model, utilizing the DETR architecture, has the ability to segment objects without prior knowledge of size or shape. We investigated the Segment-Anything Model's potential to detect LDF hepatocytes in liver biopsies"
