## **Overview of Document Loaders in LangChain**

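Document loaders read data from a source (local files, web pages, and so on) and return it as `Document` objects, each holding the text in `page_content` plus a `metadata` dictionary. Every loader provides a `load` method that returns a list of documents; many also support `lazy_load`, which yields documents one at a time to reduce memory use.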
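
To make the difference between eager and lazy loading concrete, here is a minimal standard-library sketch. The `Document` dataclass and the `LineLoader` class are hypothetical stand-ins written for this illustration; they are not LangChain's classes, only an approximation of the same pattern.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document (not LangChain's class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class LineLoader:
    """Hypothetical loader: one Document per non-empty line of text."""

    def __init__(self, text: str, source: str = "in-memory"):
        self.text = text
        self.source = source

    def lazy_load(self) -> Iterator[Document]:
        # Yield documents one at a time so the caller never holds them all.
        for number, line in enumerate(self.text.splitlines(), start=1):
            if line.strip():
                yield Document(line.strip(), {"source": self.source, "line": number})

    def load(self) -> List[Document]:
        # Eager variant: materialize everything the lazy iterator produces.
        return list(self.lazy_load())


loader = LineLoader("first line\n\nsecond line\n")
docs = loader.load()              # the full list, held in memory
first = next(loader.lazy_load())  # only the first document is built
```

With a large source, iterating over `lazy_load()` lets each document be processed and discarded before the next one is read.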

---

1. **Text Loaders**  
Load plain text files.  
    ```python
    from langchain_community.document_loaders import TextLoader
    loader = TextLoader("./example_data/example_text.md")
    docs = loader.load()  # returns a list of Document objects
    ```

---

2. **CSV Loaders**  
Parse data from CSV files.  
    ```python
    from langchain_community.document_loaders.csv_loader import CSVLoader
    loader = CSVLoader(file_path='./example_data/file_name.csv')
    data = loader.load()
    ```
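
As far as the library's documentation describes it, `CSVLoader` turns each row into its own document. The sketch below imitates that idea with the standard library only; the inline CSV text and the dictionary layout are assumptions made for illustration, not the loader's actual code or output format.

```python
import csv
import io

# Hypothetical CSV content standing in for ./example_data/file_name.csv.
raw = "name,team\nAda,Analytics\nLinus,Kernel\n"

rows_as_documents = []
for index, row in enumerate(csv.DictReader(io.StringIO(raw))):
    # Render each row as "column: value" lines, one line per column.
    content = "\n".join(f"{key}: {value}" for key, value in row.items())
    rows_as_documents.append({"page_content": content, "metadata": {"row": index}})
```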

---

3. **HTML Loaders**  
Extract content from HTML files.  
    ```python
    from langchain_community.document_loaders import UnstructuredHTMLLoader
    loader = UnstructuredHTMLLoader("example_data/fake-content.html")
    data = loader.load()
    ```
    Note: install the dependency first with `pip install "unstructured[all-docs]"`.
---

4. **Web Base Loader**  
Fetch web pages and extract their text with BeautifulSoup (requires `beautifulsoup4`).  
    ```python
    from langchain_community.document_loaders import WebBaseLoader
    loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
    data = loader.load()
    ```

---

5. **JSON Loaders**  
Extract specific fields from JSON files using a `jq` expression passed as `jq_schema` (requires the `jq` Python package).  
    ```python
    from langchain_community.document_loaders import JSONLoader
    loader = JSONLoader(
        file_path='./example_data/facebook_chat.json',
        jq_schema='.messages[].content',
        text_content=False
    )
    data = loader.load()
    ```
    Note: install the dependency first with `pip install jq` (prefix the command with `!` only when running it inside a notebook cell).
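
To show what the expression `.messages[].content` selects, here is a plain-Python equivalent. The sample JSON is hypothetical and only assumes a top-level `messages` array whose items have a `content` field; the real `facebook_chat.json` may contain more fields.

```python
import json

# Hypothetical sample shaped like a chat export with a "messages" array.
raw = '{"messages": [{"sender": "a", "content": "Hi"}, {"sender": "b", "content": "Hello"}]}'

data = json.loads(raw)
# Plain-Python equivalent of the jq expression .messages[].content
contents = [message["content"] for message in data["messages"]]
```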
---

6. **PDF Loaders**  
Load text from PDF files, one document per page (requires `pypdf`); `load_and_split` additionally passes the pages through a text splitter.  
    ```python
    from langchain_community.document_loaders import PyPDFLoader
    loader = PyPDFLoader("example_data/example-paper.pdf")
    pages = loader.load_and_split()
    ```
    Note: install the dependency first with `pip install pypdf` (prefix the command with `!` only when running it inside a notebook cell).
---

7. **File Directory Loaders**  
Load multiple documents from a directory, filtering by file type.  
    ```python
    from langchain_community.document_loaders import DirectoryLoader, TextLoader
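    # glob selects the files, loader_cls is the loader applied to each match;
    # show_progress=True needs the tqdm package installed.
    loader = DirectoryLoader('../', glob="**/*.md", show_progress=True, loader_cls=TextLoader)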
    docs = loader.load()
    ```
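
The pattern `**/*.md` is a recursive glob. The sketch below uses `pathlib` from the standard library to show which files such a pattern matches; the temporary directory tree is hypothetical, and the assumption is that the loader's matching behaves like `pathlib` globbing.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical directory tree, created only for this illustration.
    (root / "docs" / "guide").mkdir(parents=True)
    (root / "README.md").write_text("# readme")
    (root / "docs" / "guide" / "intro.md").write_text("# intro")
    (root / "docs" / "notes.txt").write_text("not markdown")

    # "**/*.md" matches Markdown files at any depth below root.
    matched = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.md"))
```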
---

Because every loader returns the same `Document` type, the output of any of them can be passed to the same downstream LangChain steps, such as text splitters and vector stores.