In [1]:
# Name of the local Ollama model, used below for both generation and embeddings
MODEL = "mistral"

In [2]:
%pip install docarray langchain_community langchain pypdf

Note: you may need to restart the kernel to use updated packages.
Collecting docarray
  Using cached docarray-0.40.0-py3-none-any.whl.metadata (36 kB)
Collecting langchain_community
  Downloading langchain_community-0.2.1-py3-none-any.whl.metadata (8.9 kB)
Collecting langchain
  Downloading langchain-0.2.1-py3-none-any.whl.metadata (13 kB)
Collecting numpy>=1.17.3 (from docarray)
  Using cached numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
Collecting orjson>=3.8.2 (from docarray)
  Downloading orjson-3.10.3-cp312-none-win_amd64.whl.metadata (50 kB)
Collecting pydantic>=1.10.8 (from docarray)
  Downloading pydantic-2.7.2-py3-none-any.whl.metadata (108 kB)

In [3]:
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings

model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)

model.invoke("Tell me a joke")

" Why don't scientists trust atoms?\n\nBecause they make up everything!"

In [4]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

# Pipe the model's output into the parser so the chain returns a plain string
chain = model | parser
chain.invoke("Tell me a joke")

" Why don't scientists trust atoms?\n\nBecause they make up everything!"

In [5]:
from langchain.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't 
answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="Here is a question")

'\nAnswer the question based on the context below. If you can\'t \nanswer the question, reply "I don\'t know".\n\nContext: Here is some context\n\nQuestion: Here is a question\n'
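
The formatted string above suggests that the template behaves much like Python's own brace-based string formatting. The following is a minimal sketch of that idea using only the standard library; `toy_template`, `variables`, and `filled` are hypothetical names for illustration, and this is not how `PromptTemplate` is implemented internally.

```python
from string import Formatter

# A shortened, hypothetical version of the template used above.
toy_template = "Context: {context}\n\nQuestion: {question}\n"

# Collect the placeholder names that appear between braces, in order.
variables = [name for _, name, _, _ in Formatter().parse(toy_template) if name]

# Fill the placeholders the same way the notebook cell does with prompt.format().
filled = toy_template.format(
    context="Here is some context",
    question="Here is a question",
)

print(variables)
print(filled)
```

Listing the placeholder names first is a cheap way to check that the dictionary passed to a chain supplies every variable the template expects.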

In [6]:
chain = prompt | model | parser

chain.invoke({"context": "My parents named me Santiago", "question": "What's your name?"})

' My name is Santiago.'
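
The `prompt | model | parser` chain above feeds each step's output into the next step. The toy sketch below shows one way such pipe composition can be built with Python's `__or__` method. The `Step` class and the `fake_*` objects are hypothetical stand-ins, not LangChain classes, and the real library does considerably more.

```python
class Step:
    """Toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # `a | b` builds a new step that passes a's output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.fn(value)


# Hypothetical stand-ins for the prompt, the model, and the parser.
fake_prompt = Step(
    lambda inputs: f"Context: {inputs['context']}\nQuestion: {inputs['question']}"
)
fake_model = Step(lambda text: f" echo -> {text.splitlines()[-1]}")
fake_parser = Step(str.strip)

toy_chain = fake_prompt | fake_model | fake_parser
result = toy_chain.invoke(
    {"context": "My parents named me Santiago", "question": "What's your name?"}
)
print(result)
```

Because every composed object is itself a `Step`, chains of any length expose the same `invoke` method as a single component.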

In [8]:
from langchain_community.document_loaders import PyPDFDirectoryLoader

# Load every PDF found in the local "pdfs" directory and split it into chunks
loader = PyPDFDirectoryLoader("pdfs")
data = loader.load_and_split()
data

[Document(page_content="Machine learning-driven new material discovery\nJiazhen Cai,aXuan Chu,aKun Xu,aHongbo Liband Jing Wei *ab\nNew materials can bring about tremendous progress in technology and applications. However, the\ncommonly used trial-and-error method cannot meet the current need for new materials. Now, a newlyproposed idea of using machine learning to explore new materials is becoming popular. In this paper, wereview this research paradigm of applying machine learning in material discovery, including datapreprocessing, feature engineering, machine learning algorithms and cross-validation procedures.Furthermore, we propose to assist traditional DFT calculations with machine learning for materialdiscovery. Many experiments and literature reports have shown the great e ﬀects and prospects of this\nidea. It is currently showing its potential and advantages in property prediction, material discovery,\ninverse design, corrosion detection and many other aspects of life.\n1. Intro

In [9]:
from langchain_community.vectorstores import DocArrayInMemorySearch

vectorstore = DocArrayInMemorySearch.from_documents(data, embedding=embeddings)



In [10]:
retriever = vectorstore.as_retriever()
retriever.invoke("Machine learning")

[Document(page_content='automatic encoder, sparse coding, restricted Boltzmann machine\n(RBM), deep belief networks (DBN), and recurrent neural\nnetworks (RNNs). Currently, deep learning is being widely usedin many \ue103elds, such as computer vision, image recognition and\nnatural language recognition. For example, convolutional neuralnetworks are used to detect corrosion in many facilities;\n83also,\nMaxim Signaevsky and his coworkers proposed the use of deeplearning algorithms to judge the accumulation of the abnormalprotein TAU to help diagnose neurodegenerative diseases.\n84Fig. 8\nshows how they extracted image patches for network training and\ntested the robustness and reliability of the network with na ¨ıve\nimages. Izhar Wallach used deep learning to predict the biolog-ical activity of small molecules in drug discovery.\n85Overall, as\na new machine learning method, deep learning has excellentdevelopment prospects.\nTable 2 summarises some basic algorithms used in material\nsc
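
The retriever call above returns the stored chunks whose embeddings are closest to the query embedding. Below is a rough, dependency-free sketch of that lookup, assuming cosine similarity as the metric. The word-count `embed` function, the `vocabulary`, and the sample `documents` are all hypothetical; the notebook itself uses Ollama embeddings instead.

```python
import math


def embed(text, vocabulary):
    """Hypothetical toy embedding: count occurrences of each vocabulary word."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    return dot / norm if norm else 0.0


def retrieve(query, documents, vocabulary, k=2):
    """Return the k documents most similar to the query."""
    query_vector = embed(query, vocabulary)
    scored = [
        (cosine_similarity(query_vector, embed(doc, vocabulary)), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


vocabulary = ["machine", "learning", "materials", "corrosion", "protein"]
documents = [
    "deep learning is a machine learning method",
    "new materials drive progress in technology",
    "abnormal protein accumulation helps diagnose disease",
]

top = retrieve("Machine learning", documents, vocabulary, k=1)
print(top)
```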

In [11]:
from operator import itemgetter

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)
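
The dictionary at the head of this chain runs each of its values against the same input and collects the results under the matching keys, which is what the prompt then receives. A small sketch of that fan-out in plain Python, where `fake_retriever` and `mapping` are hypothetical stand-ins for illustration:

```python
from operator import itemgetter


def fake_retriever(question):
    # Hypothetical retriever: returns one made-up passage for any question.
    return [f"passage about: {question}"]


inputs = {"question": "What is EfficientVitSAM"}

# Each value is a callable applied to the same input dictionary.
mapping = {
    "context": lambda data: fake_retriever(itemgetter("question")(data)),
    "question": itemgetter("question"),
}

prompt_inputs = {key: fn(inputs) for key, fn in mapping.items()}
print(prompt_inputs)
```

The question is looked up twice on purpose: once to drive retrieval and once to pass through unchanged to the prompt.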

In [12]:
questions = [
    "What is recent discovery in material science with help of machine learning",
    "What is EfficientVitSAM",
    "How CREST differs from EfficientVit"
]

for question in questions:
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print()

Question: What is recent discovery in material science with help of machine learning
Answer:  There have been several recent discoveries in material science using machine learning techniques. One notable example is the discovery of new materials with desirable properties through machine learning algorithms. For instance, researchers have used deep learning models to predict the properties of new materials based on their atomic structures, leading to the discovery of new materials that exhibit high performance in various applications such as energy storage and electronics. Another application of machine learning in material science is the development of models for predicting the mechanical behavior of materials under different loading conditions, which can help in designing more efficient and cost-effective engineering structures. Additionally, machine learning algorithms have been used to optimize the design of nanomaterials with specific properties for applications such as catalysis a