# 20240112 chatAgent accessing Google Scholar
## LangChain Test
LangChain is a framework for developing applications powered by language models. I built a simple LLM application with it to test whether it is suitable for further work.

I installed the langchain Python package and ran it in a Jupyter Notebook. Their website has a good tutorial that teaches how to [get started](https://python.langchain.com/docs/expression_language/get_started). I initialized the model, used my OpenAI API key to access the API, and asked the LLM my first question before any customization (no prompt template or retrieval yet): "how can programmable photonic circuits help optical computing?"
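Because the API key is a secret, it is safer to keep it out of the notebook itself. A minimal sketch of one way to do that, assuming the key lives in an environment variable and falling back to a hidden prompt (the helper name `load_api_key` is hypothetical, not part of LangChain):

```python
import os
from getpass import getpass


def load_api_key(name, env=None, prompt=getpass):
    """Return the secret stored under `name`, prompting only if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Fall back to an interactive prompt that does not echo the input.
        key = prompt(f"{name}: ").strip()
        if not key:
            raise ValueError(f"{name} is empty")
        env[name] = key  # make it visible to libraries that read the environment
    return key
```

Calling `load_api_key("OPENAI_API_KEY")` once at the top of the notebook would leave the key in the process environment, where the OpenAI client classes look for it by default.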


In [4]:
## install langchain before running ##
# pip install langchain
# pip install langchain_openai

## API key ##
# Set the key in the shell before starting Jupyter; never paste it into the notebook:
# export OPENAI_API_KEY="..."

## initialize the chat model ##
from langchain_openai import ChatOpenAI

# ChatOpenAI reads OPENAI_API_KEY from the environment when no key is passed
llm = ChatOpenAI()

## ask the llm a question before any customization ##
llm.invoke("how can programmable photonic circuits help optical computing?")

AIMessage(content='Programmable photonic circuits can help optical computing in several ways:\n\n1. Flexibility: Programmable photonic circuits allow for the reconfiguration of optical components and connections, enabling flexibility in designing and implementing various computational tasks. This flexibility can improve the efficiency and performance of optical computing systems.\n\n2. Adaptability: Optical computing relies on the manipulation and control of light waves to perform computations. Programmable photonic circuits can dynamically adapt to changing computational requirements, allowing for real-time adjustments and optimizations based on the specific task at hand.\n\n3. Parallel processing: Photonic circuits can process multiple signals simultaneously, taking advantage of the parallel nature of light waves. Programmability enables the routing and manipulation of multiple optical signals in parallel, leading to faster and more efficient computations.\n\n4. Complex computations:

We can also guide its response with a prompt template and add a simple output parser that converts the chat message to a string. Combining everything into one chain, we ask the same question again.

In [5]:
##guide its response with a prompt template##
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are world class technical documentation writer."),
    ("user", "{input}")
])

##combine these into a simple LLM chain##
#chain = prompt | llm 

#ask the same question again#
#chain.invoke({"input": "how can programmable photonic circuits help optical computing?"})

##add a simple output parser to convert the chat message to a string##
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

##combine##
chain = prompt | llm | output_parser

##ask the same question again##
chain.invoke({"input": "how can programmable photonic circuits help optical computing?"})

'Programmable photonic circuits play a crucial role in advancing optical computing, which aims to use light instead of electrical signals for computation. These circuits allow for the manipulation and control of light signals, enabling various functionalities required for optical computing systems. Here are some specific ways programmable photonic circuits help in optical computing:\n\n1. Light signal routing: Programmable photonic circuits can route light signals through different paths within the circuit, similar to how electrical signals are routed in electronic circuits. This routing capability allows for flexible and dynamic connectivity between different components of an optical computing system.\n\n2. Optical signal processing: Optical computing relies on manipulating light signals to perform computations. Programmable photonic circuits offer the ability to process optical signals, such as modulating the intensity, phase, or polarization of light. This allows for tasks like filt
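The expression `prompt | llm | output_parser` composes three steps so that the output of each one feeds the next. A toy sketch of that idea in plain Python follows; the `Step` class and the fake model are hypothetical stand-ins for illustration, not LangChain's implementation:

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a first, then b on a's result
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the three stages used in the notebook.
toy_prompt = Step(lambda d: f"system: You are a writer.\nuser: {d['input']}")
toy_llm = Step(lambda text: {"content": f"(model reply to: {text.splitlines()[-1]})"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_llm | toy_parser
print(toy_chain.invoke({"input": "hello"}))
```

The parser at the end is what turns the message object into the plain string seen in the second output, in contrast to the `AIMessage(...)` returned by the bare model call.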

## Retrieval Chain
In order to properly answer the original question, we need to provide additional context to the LLM. We can do this via retrieval. First, we need to load the data that we want to index:
[Programmable photonic circuits](https://www.nature.com/articles/s41586-020-2764-0)

I installed the required package and built the index. Now that the data is indexed in a vector store, we create a retrieval chain. This chain takes an incoming question, looks up relevant documents, then passes those documents along with the original question to an LLM and asks it to answer the original question.

First, let's set up the chain that takes a question and the retrieved documents and generates an answer. The documents come from the retriever built on the vector store, so for a given question the retriever dynamically selects the most relevant documents and passes them in.

Now, invoke this chain to get a new answer based on the documents from the retriever. This answer looks more accurate, detailed and concise. In other words, it reads less like a generic ChatGPT answer, which is a good sign!


In [6]:
##load the data##
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://www.nature.com/articles/s41586-020-2764-0")

docs = loader.load()

##index it in the vectorstore##
from langchain_openai import OpenAIEmbeddings

# OpenAIEmbeddings reads OPENAI_API_KEY from the environment when no key is passed
embeddings = OpenAIEmbeddings()

from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embeddings)

from langchain.chains.combine_documents import create_stuff_documents_chain

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

response = retrieval_chain.invoke({"input": "how can programmable photonic circuits help optical computing?"})
print(response["answer"])

Programmable photonic circuits can help optical computing by providing a higher-level platform for prototyping novel optical functionalities without the need for custom chip fabrication. These circuits can be designed and programmed using software, allowing for a wide variety of functions to be implemented through on-chip waveguides, tunable beam couplers, and optical phase shifters. This flexibility enables the development of complex optical operations and algorithms, making optical computing more versatile and efficient.
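The retrieval chain above boils down to three steps: embed the chunks, pick the chunks closest to the question, and place them in the prompt. A self-contained toy sketch of those steps, assuming bag-of-words counts and cosine similarity in place of OpenAI embeddings and FAISS (all function names here are hypothetical):

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, chunks, k=1):
    """Stuff the retrieved chunks into the same template used in the notebook."""
    context = "\n".join(retrieve(question, chunks, k))
    return (
        "Answer the following question based only on the provided context:\n\n"
        f"<context>\n{context}\n</context>\n\nQuestion: {question}"
    )


chunks = [
    "Programmable photonic circuits use tunable beam couplers and phase shifters.",
    "The weather in Halle is cloudy today.",
]
print(build_prompt("How do photonic circuits use phase shifters?", chunks))
```

Only the photonics chunk ends up inside the context block, which is why the model's answer stays close to the indexed paper.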


## Conversation Retrieval
We still use the create_retrieval_chain function, but the retriever should no longer work on the most recent input alone. It should take the whole conversation history into account.

The history-aware retriever receives the most recent message as input and the conversation history as chat_history, and uses the LLM to generate a search query from both.

Create a new chain, seed the chat history with the question "Can programmable photonics circuits help in-memory optical computing?" (answered with "Yes!"), and ask the follow-up "Tell me how". It returns documents about programmable photonic circuits, because the LLM generated a new query that combines the chat history with the follow-up question.

Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.

Finally, test whether the model can answer questions based on the chat history, which makes it behave more like a chatbot.

In [7]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

# we need a prompt that we can pass into an LLM to generate this search query

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
])
retriever_chain = create_history_aware_retriever(llm, retriever, prompt)

from langchain_core.messages import HumanMessage, AIMessage

chat_history = [HumanMessage(content="Can programmable photonics circuits help in-memory optical computing?"), AIMessage(content="Yes!")]
# retriever_chain.invoke({
#     "chat_history": chat_history,
#     "input": "Tell me how"
# })


prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, prompt)

retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)

chat_history = [HumanMessage(content="Can programmable photonics circuits help in-memory optical computing?"), AIMessage(content="Yes!")]
response = retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

print(response["answer"])

Programmable photonic circuits have the potential to play a significant role in in-memory optical computing. In-memory computing refers to the concept of performing computational tasks directly within the memory storage units, eliminating the need for data transfers between memory and processing units. 

Optical computing, on the other hand, utilizes light instead of electrical signals to perform computations, which offers advantages such as high-speed processing, low power consumption, and parallelism. 

Programmable photonic circuits can enable the implementation of complex computational operations directly within the memory units using light. By integrating photonic components such as waveguides, beam couplers, and optical phase shifters, these circuits can be programmed to perform various functions and operations on the optical signals.

For example, programmable photonic circuits can be used to implement matrix operations, which are fundamental in many computational tasks. By util
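The follow-up "Tell me how" contains no searchable keywords on its own, which is why the history has to be folded into the search query. The sketch below shows the message list that the query-generation prompt expands to, plus a naive stand-in for the LLM's rewriting step; both helper functions are hypothetical and use plain tuples instead of LangChain message classes:

```python
def expand_prompt(chat_history, user_input):
    """Return the message list the query-generation prompt would expand to."""
    instruction = (
        "Given the above conversation, generate a search query to look up "
        "in order to get information relevant to the conversation"
    )
    return [*chat_history, ("user", user_input), ("user", instruction)]


def naive_search_query(chat_history, user_input):
    """Stand-in for the LLM: join earlier user turns with the follow-up."""
    earlier = [text for role, text in chat_history if role == "user"]
    return " ".join([*earlier, user_input])


history = [
    ("user", "Can programmable photonics circuits help in-memory optical computing?"),
    ("assistant", "Yes!"),
]
for role, text in expand_prompt(history, "Tell me how"):
    print(f"{role}: {text}")
print(naive_search_query(history, "Tell me how"))
```

A real LLM would produce a shorter, cleaner query than this concatenation, but the inputs it sees are the same.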

## Agent with searching tool
An agent lets the LLM decide which steps to take. To do that, it needs access to tools, such as the retriever we just created and a search tool for up-to-date information.

For the search tool, I tried [Tavily](https://tavily.com/), a search engine built specifically for AI agents (LLMs).

Now we can invoke the agent to see how it responds.

1. Same question about photonics: agent_executor.invoke({"input": "how can programmable photonic circuits help optical computing?"})
   
2. Ask a question that requires real-time information: agent_executor.invoke({"input": "what is the weather in Halle in Germany?"})
   
3. Ask a question that needs Google Scholar: agent_executor.invoke({"input": "Can you help me find a recent article about phase change material through Google scholar?"})

In [8]:
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "photonics_search",
    "Search for information about photonics. For any questions about photonics, you must use this tool!",
)

#pip install -U langchain-community tavily-python

import getpass
import os

# prompt for the key instead of hard-coding it in the notebook
os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API key: ")


from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults()

tools = [retriever_tool, search]

#pip install langchainhub

from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
# the OpenAI key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "how can programmable photonic circuits help optical computing?"})
#agent_executor.invoke({"input": "what is the weather in Halle in Germany?"})
#agent_executor.invoke({"input": "What is the earliest flight from Frankfurt to Shanghai?"})
#agent_executor.invoke({"input": "Can you help me find a recent article about phase change material through Google scholar?"})



> Entering new AgentExecutor chain...
Programmable photonic circuits can play a crucial role in advancing optical computing by providing flexibility and reconfigurability in the design and operation of optical systems. Here are a few ways in which programmable photonic circuits can help optical computing:

1. Reconfigurability: Programmable photonic circuits allow for the dynamic reconfiguration of optical components and connections. This flexibility enables the implementation of different computational functions and algorithms, making it easier to adapt to changing computing requirements.

2. Increased functionality: By using programmable photonic circuits, it is possible to integrate multiple optical components, such as waveguides, modulators, detectors, and switches, onto a single chip. This integration enables the creation of complex optical systems with enhanced functionality, including logic operations, signal processing, and data manipulation.

3. Improved

{'input': 'how can programmable photonic circuits help optical computing?',
 'output': "Programmable photonic circuits can play a crucial role in advancing optical computing by providing flexibility and reconfigurability in the design and operation of optical systems. Here are a few ways in which programmable photonic circuits can help optical computing:\n\n1. Reconfigurability: Programmable photonic circuits allow for the dynamic reconfiguration of optical components and connections. This flexibility enables the implementation of different computational functions and algorithms, making it easier to adapt to changing computing requirements.\n\n2. Increased functionality: By using programmable photonic circuits, it is possible to integrate multiple optical components, such as waveguides, modulators, detectors, and switches, onto a single chip. This integration enables the creation of complex optical systems with enhanced functionality, including logic operations, signal processing, and 

In [9]:
agent_executor.invoke({"input": "what is the weather in Halle in Germany?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'weather in Halle, Germany'}`


[{'url': 'https://www.weather25.com/europe/germany/saxony-anhalt/halle', 'content': "will give you an idea of weather trends in Halle. For example, the weather in Halle in January 2024.  The weather in Halle, Germany Halle weather by months Halle weather Halle 14 day weather The weather today in Halle  Weather25.com provides all the information that you need to know about the weather in Halle, Germany. Stay up to date  in Halle including humidity, wind, chance of rain and more on the Halle current weather page.° F Sunrise: Sunset: Miles % % Halle weather report Looking for Halle weather updates? You've come to the right place! The current temperature in Halle is ° F. You can find more detailed information about the weather in Halle including humidity, wind, chance of rain and more on the Halle current weather page. The wea

{'input': 'what is the weather in Halle in Germany?',
 'output': "I'm sorry, but I couldn't find the current weather in Halle, Germany. However, you can visit [Weather25.com](https://www.weather25.com/europe/germany/saxony-anhalt/halle) for more information about the weather in Halle."}

In [10]:
agent_executor.invoke({"input": "Can you help me find a recent article about phase change material through Google scholar?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'phase change material site:scholar.google.com'}`


[]
I'm sorry, but I couldn't find any recent articles about phase change material through Google Scholar. Is there anything else I can help you with?

> Finished chain.


{'input': 'Can you help me find a recent article about phase change material through Google scholar?',
 'output': "I'm sorry, but I couldn't find any recent articles about phase change material through Google Scholar. Is there anything else I can help you with?"}
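The traces above follow one loop: the model picks a tool (or none), the executor runs it, and the observation goes back to the model. In the first trace there is no `Invoking:` line, so the model appears to have answered the photonics question directly, while the other two called the search tool. A toy sketch of the loop, where a keyword rule stands in for the model's decision and all tool functions are hypothetical placeholders:

```python
def photonics_search(query):
    """Hypothetical stand-in for the retriever tool."""
    return f"[indexed paper] passages matching: {query}"


def web_search(query):
    """Hypothetical stand-in for the web search tool."""
    return f"[web] results for: {query}"


TOOLS = {"photonics_search": photonics_search, "web_search": web_search}


def choose_tool(question):
    """Stand-in for the model's decision; a real agent lets the LLM pick."""
    return "photonics_search" if "photonic" in question.lower() else "web_search"


def run_agent(question):
    name = choose_tool(question)          # 1. decide which tool to call
    observation = TOOLS[name](question)   # 2. run it and collect the result
    # 3. a real agent would hand the observation back to the LLM for the answer
    return {"input": question, "tool": name, "observation": observation}


print(run_agent("what is the weather in Halle in Germany?"))
```

With a real LLM the choice depends on the tool descriptions, which is why the description passed to the retriever tool is worded so strongly.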

## Access Google Scholar
Since Tavily could not reach Google Scholar to return references for our request, we need another method or API to access this search engine.

The LangChain community provides a tool that accesses Google Scholar through SerpApi, which is free for 1000 requests a month.

[langchain_google_scholar](https://python.langchain.com/docs/integrations/tools/google_scholar)

[google_scholar_api](https://serpapi.com/google-scholar-api)

After importing the tool, getting an API key, and adding the Google Scholar tool to the tools defined before, the agent can now answer the same question as before: agent_executor.invoke({"input": "Can you help me find a recent article about phase change material through Google scholar?"})


In [11]:
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "photonics_search",
    "Search for information about photonics. For any questions about photonics, you must use this tool!",
)

#pip install -U langchain-community tavily-python

import getpass
import os

# prompt for the key instead of hard-coding it in the notebook
os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API key: ")



from langchain_community.tools.google_scholar import GoogleScholarQueryRun
from langchain_community.utilities.google_scholar import GoogleScholarAPIWrapper

# prompt for the SerpApi key instead of hard-coding it in the notebook
os.environ["SERP_API_KEY"] = getpass.getpass("SerpApi API key: ")
scholar = GoogleScholarQueryRun(api_wrapper=GoogleScholarAPIWrapper())
#scholar.run("Phase Change Material")

from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults()

tools = [retriever_tool, search, scholar]

#pip install langchainhub

from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
# the OpenAI key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "Can you help me find a recent article about phase change material through Google scholar?"})



> Entering new AgentExecutor chain...

Invoking: `google_scholar` with `recent article about phase change material`


Title: Phase change materials
Authors: 
Summary: S Raoux - Annual Review of Materials Research, 2009 - annualreviews.org
Total-Citations: 660

Title: A review on phase change material application in building
Authors: 
Summary: Y Cui, J Xie, J Liu, J Wang… - Advances in Mechanical …, 2017 - journals.sagepub.com
Total-Citations: 171

Title: A review on phase-change materials: Mathematical modeling and simulations
Authors: Y Dutil,DR Rousse,NB Salah
Summary: Y Dutil, DR Rousse, NB Salah, S Lassue… - … and sustainable Energy …, 2011 - Elsevier
Total-Citations: 785

Title: Phase change materials and phase change memory
Authors: F Xiong,M Wuttig,E Pop
Summary: S Raoux, F Xiong, M Wuttig, E Pop - MRS bulletin, 2014 - cambridge.org
Total-Citations: 557

Title: A review on effect of phase change material encapsulation on the thermal pe

{'input': 'Can you help me find a recent article about phase change material through Google scholar?',
 'output': 'Here are some recent articles about phase change materials:\n\n1. Title: Phase change materials\n   Authors: S Raoux\n   Summary: Annual Review of Materials Research, 2009\n   Total Citations: 660\n\n2. Title: A review on phase change material application in building\n   Authors: Y Cui, J Xie, J Liu, J Wang\n   Summary: Advances in Mechanical Engineering, 2017\n   Total Citations: 171\n\n3. Title: A review on phase-change materials: Mathematical modeling and simulations\n   Authors: Y Dutil, DR Rousse, NB Salah\n   Summary: Energy and Buildings, 2011\n   Total Citations: 785\n\n4. Title: Phase change materials and phase change memory\n   Authors: F Xiong, M Wuttig, E Pop\n   Summary: MRS Bulletin, 2014\n   Total Citations: 557\n\n5. Title: A review on effect of phase change material encapsulation on the thermal performance of a system\n   Authors: PB Salunkhe, PS Shembekar
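The tool returns plain text in `Title / Authors / Summary / Total-Citations` blocks, and the results shown above date from 2009 to 2017 even though the question asked for a recent article. One possible follow-up is to parse the text and sort by year before handing it to the model. A minimal sketch, assuming the block layout seen in the output above and that the year appears in the summary line (both function names are hypothetical):

```python
import re


def parse_scholar_output(text):
    """Split the tool's text output into one dict per result."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")  # split on the first colon only
            if sep:
                record[key.strip()] = value.strip()
        if "Title" in record:
            match = re.search(r"\b((?:19|20)\d{2})\b", record.get("Summary", ""))
            record["Year"] = int(match.group(1)) if match else None
            records.append(record)
    return records


def newest_first(records):
    """Sort results by year, most recent first; unknown years go last."""
    return sorted(records, key=lambda r: r["Year"] or 0, reverse=True)


sample = """Title: Phase change materials
Authors:
Summary: S Raoux - Annual Review of Materials Research, 2009 - annualreviews.org
Total-Citations: 660

Title: A review on phase change material application in building
Authors:
Summary: Y Cui, J Xie, J Liu, J Wang… - Advances in Mechanical …, 2017 - journals.sagepub.com
Total-Citations: 171"""

for r in newest_first(parse_scholar_output(sample)):
    print(r["Year"], r["Title"])
```

Working from the parsed fields rather than the model's paraphrase also keeps the journal names exactly as the search returned them.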

## Summary
This is a basic agent built on OpenAI models. It can answer photonics questions using the retriever we created, answer questions that need real-time information using the Tavily search tool, and help find papers through the Google Scholar search engine.

After building an LLM application with LangChain, I think it could be a useful tool for starting work on the human-computer interaction part. There, the model needs to collect users' questions and needs, then clarify details through further conversation with the user, which should improve the accuracy of the subsequent service script generation. LangChain together with the OpenAI API can provide a basic chatbot that retains the conversation history.

I think LangChain can also help with building our knowledge graph. With an embedding function, we can let the model ingest papers or other useful information we want it to learn, and then use the retrieval chain to answer questions by combining that knowledge.