
# langchain-core

contains simple, core abstractions that have emerged as a standard, as well as LangChain Expression Language as a way to compose these components together. This package is now at version 0.1 and all breaking changes will be accompanied by a minor version bump.

# langchain-community

contains all third party integrations. We will work with partners on splitting key integrations out into standalone packages over the next month.

# langchain

contains higher-level and use-case specific chains, agents, and retrieval algorithms that are at the core of your application's cognitive architecture. We are targeting a launch of a stable 0.1 release for langchain in early January.


## Open in Collab

In [39]:
%pip install --upgrade --quiet requests==2.32.4 langchain langchain-community langchain-openai langchain-experimental langchain-neo4j langchain-text-splitters neo4j wikipedia tiktoken yfiles_jupyter_graphs

In [40]:
!pip install pydantic



In [41]:
from google.colab import userdata
OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")

In [None]:
NEO4J_URI=userdata.get("NEO4J_URI")
NEO4J_USERNAME=userdata.get("NEO4J_USERNAME")
NEO4J_PASSWORD=userdata.get("NEO4J_PASSWORD")
NEO4J_DATABASE=userdata.get("NEO4J_DATABASE")
AURA_INSTANCEID=userdata.get("AURA_INSTANCEID")
AURA_INSTANCENAME=userdata.get("AURA_INSTANCENAME")

In [43]:
import os

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
os.environ["NEO4J_URI"] = NEO4J_URI
os.environ["NEO4J_USERNAME"] = NEO4J_USERNAME
os.environ["NEO4J_PASSWORD"] = NEO4J_PASSWORD

In [45]:
from langchain_neo4j import Neo4jGraph
graph = Neo4jGraph()

In [46]:
from langchain_community.document_loaders import WikipediaLoader
raw_documents = WikipediaLoader(query="Elizabeth I").load()

len(raw_documents)



  lis = BeautifulSoup(html).find_all('li')


24

In [None]:
# raw_documents[:3]

In [47]:
from langchain_text_splitters import TokenTextSplitter

text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=24)
documents = text_splitter.split_documents(raw_documents[:3])

In [48]:
from langchain_openai.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model="gpt-4o-2024-08-06")

In [49]:
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm_transformer = LLMGraphTransformer(llm=llm)

In [50]:
graph_documents = llm_transformer.convert_to_graph_documents(documents)
graph_documents

[GraphDocument(nodes=[Node(id='Elizabeth I', type='Person', properties={}), Node(id='Henry Viii', type='Person', properties={}), Node(id='Anne Boleyn', type='Person', properties={}), Node(id='House Of Tudor', type='Organization', properties={}), Node(id='Elizabethan Era', type='Event', properties={}), Node(id='Edward Vi', type='Person', properties={}), Node(id='Lady Jane Grey', type='Person', properties={}), Node(id='Mary I', type='Person', properties={}), Node(id='William Cecil', type='Person', properties={}), Node(id='Baron Burghley', type='Title', properties={}), Node(id='Church Of England', type='Organization', properties={}), Node(id='James Vi Of Scotland', type='Person', properties={}), Node(id='Francis Walsingham', type='Person', properties={}), Node(id='England', type='Location', properties={}), Node(id='Ireland', type='Location', properties={}), Node(id='Elizabethan Religious Settlement', type='Event', properties={}), Node(id='English Protestant Church', type='Organization', p

In [51]:
graph.add_graph_documents(
    graph_documents,
    baseEntityLabel=True,
    include_source=True
)

In [52]:
# directly show the graph resulting from the given Cypher query
default_cypher = "MATCH (s)-[r:!MENTIONS]->(t) RETURN s,r,t LIMIT 50"

In [53]:
from yfiles_jupyter_graphs import GraphWidget
from neo4j import GraphDatabase

try:
  import google.colab
  from google.colab import output
  output.enable_custom_widget_manager()
except:
  pass

In [54]:
def showGraph(cypher: str = default_cypher):
    # create a neo4j session to run queries
    driver = GraphDatabase.driver(
        uri = os.environ["NEO4J_URI"],
        auth = (os.environ["NEO4J_USERNAME"],
                os.environ["NEO4J_PASSWORD"]))
    session = driver.session()
    widget = GraphWidget(graph = session.run(cypher).graph())
    widget.node_label_mapping = 'id'
    display(widget)
    return widget

showGraph()

GraphWidget(layout=Layout(height='800px', width='100%'))

GraphWidget(layout=Layout(height='800px', width='100%'))

In [55]:
from typing import List, Tuple, Optional
from langchain_community.vectorstores import Neo4jVector

In [56]:
from langchain_openai.embeddings import OpenAIEmbeddings


In [57]:
vector_index = Neo4jVector.from_existing_graph(
    OpenAIEmbeddings(),
    search_type="hybrid",
    node_label="Document",
    text_node_properties=["text"],
    embedding_node_property="embedding",
    # index_name="vector_768" # Use a different index name
)

In [58]:
graph.query("CREATE FULLTEXT INDEX entity IF NOT EXISTS FOR (e:__Entity__) ON EACH [e.id] ")

[]

In [59]:
from pydantic import BaseModel, Field
# Extract entities from text
class Entities(BaseModel):
    """Identifying information about entities."""

    names: List[str] = Field(
        ...,
        description="All the person, organization, or business entities that "
        "appear in the text",
    )

In [60]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts.prompt import PromptTemplate

In [61]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are extracting organization and person entities from the text.",
        ),
        (
            "human",
            "Use the given format to extract information from the following "
            "input: {question}",
        ),
    ]
)

In [62]:
entity_chain = prompt | llm.with_structured_output(Entities)

In [63]:
entity_chain.invoke({"question": "Where was Amelia Earhart born?"}).names

['Amelia Earhart']

In [64]:
from langchain_community.vectorstores.neo4j_vector import remove_lucene_chars

In [65]:
def generate_full_text_query(input: str) -> str:
    full_text_query = ""
    words = [el for el in remove_lucene_chars(input).split() if el]
    for word in words[:-1]:
        full_text_query += f" {word}~2 AND"
    full_text_query += f" {words[-1]}~2"
    return full_text_query.strip()

In [68]:
# Fulltext index query
def structured_retriever(question: str) -> str:
    result = ""
    entities = entity_chain.invoke({"question": question})
    for entity in entities.names:
        response = graph.query(
            """CALL db.index.fulltext.queryNodes('keyword', $query, {limit:2})
                YIELD node,score
                CALL {
                  WITH node // Explicitly bring 'node' into the subquery scope
                  MATCH (node)-[r:!MENTIONS]->(neighbor)
                  RETURN node.id + ' - ' + type(r) + ' -> ' + neighbor.id AS output
                  UNION ALL
                  WITH node // Explicitly bring 'node' into the subquery scope
                  MATCH (node)<-[r:!MENTIONS]-(neighbor)
                  RETURN neighbor.id + ' - ' + type(r) + ' -> ' +  node.id AS output
                }
                RETURN output LIMIT 50
            """,
            {"query": generate_full_text_query(entity)},
        )
        result += "\n".join([el['output'] for el in response])
    return result

In [69]:
print(structured_retriever("Who is Elizabeth I?"))






In [None]:
def retriever(question: str):
    print(f"Search query: {question}")
    structured_data = structured_retriever(question)
    unstructured_data = [el.page_content for el in vector_index.similarity_search(question)]
    # print(f"Unstructured data: {unstructured_data}")

    final_data = f"""Structured data:
{structured_data}
Unstructured data:
{"#Document ". join(unstructured_data)}
    """
    return final_data

In [72]:
_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question,
in its original language.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""

CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)

In [73]:
from langchain_core.runnables import (
    RunnableBranch,
    RunnableLambda,
    RunnableParallel,
    RunnablePassthrough,
)

from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import ConfigurableField

In [74]:
def _format_chat_history(chat_history: List[Tuple[str, str]]) -> List:
    buffer = []
    for human, ai in chat_history:
        buffer.append(HumanMessage(content=human))
        buffer.append(AIMessage(content=ai))
    return buffer

_search_query = RunnableBranch(
    # If input includes chat_history, we condense it with the follow-up question
    (
        RunnableLambda(lambda x: bool(x.get("chat_history"))).with_config(
            run_name="HasChatHistoryCheck"
        ),  # Condense follow-up question and chat into a standalone_question
        RunnablePassthrough.assign(
            chat_history=lambda x: _format_chat_history(x["chat_history"])
        )
        | CONDENSE_QUESTION_PROMPT
        | ChatOpenAI(temperature=0)
        | StrOutputParser(),
    ),
    # Else, we have no chat history, so just pass through the question
    RunnableLambda(lambda x : x["question"]),
)

template = """Answer the question based only on the following context:
{context}

Question: {question}
Use natural language and be concise.
Answer:"""

In [75]:
prompt = ChatPromptTemplate.from_template(template)

chain = (
    RunnableParallel(
        {
            "context": _search_query | retriever,
            "question": RunnablePassthrough(),
        }
    )
    | prompt
    | llm
    | StrOutputParser()
)

In [76]:
chain.invoke({"question": "Which house did Elizabeth I belong to?"})

Search query: Which house did Elizabeth I belong to?




Unstructured data: ['\ntext: Elizabeth I (7 September 1533 – 24 March 1603) was the Queen of England and Ireland from 17 November 1558 until her death. She was the last and longest reigning monarch of the House of Tudor. Her eventful reign, and its effect on history and culture, gave name to the Elizabethan era.\nElizabeth was the only surviving child of Henry VIII and his second wife, Anne Boleyn. When Elizabeth was two years old, her parents\' marriage was annulled, her mother was executed, and Elizabeth was declared illegitimate. Henry restored her to the line of succession when she was 10. After Henry\'s death in 1547, Elizabeth\'s younger half-brother Edward VI ruled until his own death in 1553, bequeathing the crown to a Protestant cousin, Lady Jane Grey, and ignoring the claims of his two half-sisters, Mary and Elizabeth, despite statutes to the contrary. Edward\'s will was quickly set aside and the Catholic Mary became queen, deposing Jane. During Mary\'s reign, Elizabeth was i

'Elizabeth I belonged to the House of Tudor.'

In [77]:
chain.invoke(
    {
        "question": "When was she born?",
        "chat_history": [("Which house did Elizabeth I belong to?", "House Of Tudor")],
    }
)

Search query: When was Elizabeth I born?




Unstructured data: ["\ntext:  military campaigns in the Netherlands, France, and Ireland. By the mid-1580s, England could no longer avoid war with Spain.\nAs she grew older, Elizabeth became celebrated for her virginity. A cult of personality grew around her which was celebrated in the portraits, pageants, and literature of the day. The Elizabethan era is famous for the flourishing of English drama, led by playwrights such as William Shakespeare and Christopher Marlowe, the prowess of English maritime adventurers, such as Francis Drake and Walter Raleigh, and for the defeat of the Spanish Armada. Some historians depict Elizabeth as a short-tempered, sometimes indecisive ruler, who enjoyed more than her fair share of luck. Towards the end of her reign, a series of economic and military problems weakened her popularity. Elizabeth is acknowledged as a charismatic performer and a dogged survivor in an era when government was ramshackle and limited, and when monarchs in neighbouring countri

'Elizabeth I was born on 7 September 1533.'