#### <font color="green">Data Loaders</font>
- Load all kinds of data and then ask the LLM questions about it.
- Connect with data sources and load private documents.

#### <font color="green">LangChain built-in data loaders</font>
- Labeled as "integrations".
- Most of them require to install the corresponding libraries.

#### <font color="green">LangChain documentation on Document Loaders</font>
- See the documentation page <a href="">here.</a>
- See the list of built-in document loaders <a href="">here.</a>

In [None]:
# !pip install python-dotenv

In [127]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
groq_api_key = os.environ["GROQ_API_KEY"]

#### <font color="green">Install LangChain</font>

In [None]:
# !pip install langchain

<font color="green">Connect with an LLM</font>

In [None]:
# !pip install langchain-openai
# !pip install langchain-groq

###### <font color="blue">NOTE:</font> Since right now is the best LLM in the market, We will use OpenAI by default. You will see how to connect with other Open Source LLMs like Llama3 or Mistral Models.

##### <font color="green">Simple data loading</font>

##### <font color="blue">Loading a .txt file</font>

In [120]:
from langchain_groq import ChatGroq

chatModel = ChatGroq(model="llama3-70b-8192")

In [None]:
# !pip install langchain-community

In [75]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("./data/ageye-meta-data.txt")

loaded_data = loader.load()

In [76]:
loaded_data

[Document(metadata={'source': './data/ageye-meta-data.txt'}, page_content='What is our goal?\n\nEnsure that fresh, nutritious, and\nsustainable food is accessible to all.\n\nAs urbanization increases and space becomes a premium, indoor hydroponic farming is not just a choice; itâ€™s a necessity. And for such an important mission, of building an equitable food system, we need solutions that are smart, intuitive, and efficient.\n\nA Global Problem\n\nFacing the Future of Food: A Critical Juncture\n\nDemand Skyrockets, Supply Dwindles\n\nBy the year 2050, our world will be home to nearly 10 billion people, all needing sustenance. Yet, the capacity of our planet to provide is under unprecedented pressure. Valuable agricultural land is vanishing, swallowed by urban expansion, challenged by changing climates, and degraded by practices that fail to stand the test of sustainability.\n\nThe Cost of Conventional Farming\n\nThe way we grow food now demands too much water and leans heavily on chem

In [77]:
type(loaded_data)

list

##### <font color="blue">Loading a .CSV file</font>

In [80]:
from langchain_community.document_loaders import CSVLoader

loader = CSVLoader("./data/farm-data-logs.csv")

loaded_data = loader.load()

In [81]:
loaded_data

[Document(metadata={'source': './data/farm-data-logs.csv', 'row': 0}, page_content=': 0\nannotation_type: Box\ndate: 20-12-2021\ntypeOf_farm: RGB\ncam_height: \nplant_family: arugula\nplant_catergory: \nannotation_labels: leaf\ncondition: healthy\nprocess_count: 50\ntimestamp: 11:59:07'),
 Document(metadata={'source': './data/farm-data-logs.csv', 'row': 1}, page_content=': 1\nannotation_type: Box\ndate: 20-12-2021\ntypeOf_farm: RGB\ncam_height: \nplant_family: butter-head-green\nplant_catergory: \nannotation_labels: leaf\ncondition: healthy\nprocess_count: 64\ntimestamp: 15:45:39'),
 Document(metadata={'source': './data/farm-data-logs.csv', 'row': 2}, page_content=': 2\nannotation_type: Box\ndate: 20-12-2021\ntypeOf_farm: RGB\ncam_height: \nplant_family: butter-head-red\nplant_catergory: \nannotation_labels: leaf\ncondition: healthy\nprocess_count: 36\ntimestamp: 11:05:36'),
 Document(metadata={'source': './data/farm-data-logs.csv', 'row': 3}, page_content=': 3\nannotation_type: Box\nd

##### <font color="blue">Loading a .html file</font>

In [None]:
# !pip install bs4

In [82]:
from langchain_community.document_loaders import UnstructuredHTMLLoader

loader = UnstructuredHTMLLoader('./data/AGEYE-Truly-Intelligent-Farming-html.html')

loaded_data = loader.load()

In [83]:
loaded_data

[Document(metadata={'source': './data/AGEYE-Truly-Intelligent-Farming-html.html'}, page_content='Welcome, to the\n\nFuture\n\nof Indoor Farming.\n\nAGEYE Develops Next Generation Technologies\n\nfor Sustainable Food Production\n\nWe believe in making advanced technology available to every grower, no matter the scale of their operations, to enhance their productivity and sustainability. This initiative is a step towards our larger objective of establishing vertical farming as a practical and profitable way to produce crops throughout the year.\n\nour Approach\n\nHelping Indoor Farmers do M.O.R.E with Less\n\nMonitor Every Moment Of Every Plant, 24×7\n\nUtilizing advanced sensors and AI-driven analytics, AGEYE enables farmers to keep a constant eye on their crops, identifying issues like pests, diseases, or nutritional deficiencies in real-time. This proactive monitoring reduces crop loss and improves yield quality, directly contributing to higher profitability.\n\nOptimize Growth Factor

##### <font color="blue">Loading a .pdf file</font>

In [87]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("./data/AGEYE-Truly-Intelligent-Farming.pdf")

loaded_data = loader.load_and_split()

In [88]:
loaded_data

[Document(metadata={'source': './data/AGEYE-Truly-Intelligent-Farming.pdf', 'page': 0}, page_content='AGEYE Develops Next Generation Technologies\nfor Sustainable Food ProductionWelcome to the\nof Indoor Farming.Future\nCopyright AGEYE Technologies, Inc. 2024. All Rights Reserved.'),
 Document(metadata={'source': './data/AGEYE-Truly-Intelligent-Farming.pdf', 'page': 1}, page_content='Our MissionOur Goal is Clear:  To ensure fresh, nutritious, \nand sustainable food that is accessible to all.\nAGEYE believes in a future where \naccess to healthy and nutritious \nfood is not a luxury, but reality. By \nmaximizing yields and ensuring \nconsistent quality, we’re paving the \nway for sustainable food systems \nand taking proactive steps towards \neliminating global hunger.\nThrough our comprehensive \nmanagement systems, indoor \nfarms can produce cleaner, \npesticide-free, and nutrient-rich \nproduce, fostering a healthier global \npopulation and ensuring well-being \nfor all ages.\nCopyri

In [89]:
loaded_data[0].page_content

'AGEYE Develops Next Generation Technologies\nfor Sustainable Food ProductionWelcome to the\nof Indoor Farming.Future\nCopyright AGEYE Technologies, Inc. 2024. All Rights Reserved.'

#### <font color="green">Loading a Wikipedia page and asking question about it</font>

In [None]:
# !pip install wikipedia

In [115]:
from langchain_community.document_loaders import WikipediaLoader

loader = WikipediaLoader('query=nvidia, load_max_docs=1')

loaded_data = loader.load()[0].page_content

In [116]:
loaded_data

'Feature levels in Direct3D define strict sets of features required by certain versions of the Direct3D API and runtime, as well as additional optional feature levels available within the same API version.\n\n\n== Overview ==\n\nFeature levels encapsulate hardware-specific capabilities that exist on top of common mandatory requirements and features in a particular version of the API.  The levels are grouped in strict supersets of each other, so each higher level includes all features required on every lower level.\nSome feature levels include previously optional hardware features which are promoted to a mandatory status with new revisions of the API to better expose newer hardware. More advanced features such as new shader models and rendering stages are only exposed on up-level hardware, however the hardware is not required to support all of these feature levels and the Direct3D runtime will make the necessary translations.\nFeature levels allow developers to unify the rendering pipel

In [124]:
from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages(
    [
        ("human", "Answer this {question}, here is some extra {context}"),
    ]
)

message = chat_template.format_messages(
    name = "Cuisine",
    question="What is a spinach?",
    context=loaded_data
)

In [125]:
response = chatModel.invoke(message)

In [126]:
response.content

'I think there may be some confusion here!\n\nThe text you provided is about Direct3D, a graphics API, and its feature levels. It has nothing to do with spinach, which is a leafy green vegetable commonly used in cooking.\n\nSo, to answer your question: Spinach is a type of leafy green vegetable that is rich in nutrients and is commonly used in salads, smoothies, and cooked dishes. It is not related to Direct3D or feature levels in any way.'