# UnstructuredLoader

```{=mdx}

:::tip Compatibility

Only available on Node.js.

:::

```

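This notebook provides a quick overview for getting started with the `UnstructuredLoader` [document loader](/docs/concepts/document_loaders). For detailed documentation of all `UnstructuredLoader` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_unstructured.UnstructuredLoader.html).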

## Overview
### Integration details

| Class | Package | Compatibility | Local | [PY support](https://python.langchain.com/docs/integrations/document_loaders/unstructured_file) |
| :--- | :--- | :---: | :---: |  :---: |
| [UnstructuredLoader](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_unstructured.UnstructuredLoader.html) | [@langchain/community](https://api.js.langchain.com/modules/langchain_community_document_loaders_fs_unstructured.html) | Node-only | ✅ | ✅ |

## Setup

To access the `UnstructuredLoader` document loader, you'll need to install the `@langchain/community` integration package, create an Unstructured account, and get an API key.

### Local

You can run Unstructured locally on your computer using Docker. To do so, you need to have Docker installed. You can find the instructions for installing Docker [here](https://docs.docker.com/get-docker/).

```bash
docker run -p 8000:8000 -d --rm --name unstructured-api downloads.unstructured.io/unstructured-io/unstructured-api:latest --port 8000 --host 0.0.0.0
```
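Once the container is running, the loader has to be pointed at the local endpoint instead of the hosted API. The following is a minimal sketch, not the library's own code: `localUnstructuredOptions` is a hypothetical helper, and both the `apiUrl` option name and the `/general/v0/general` path are assumptions that should be checked against the API reference for your installed version.

```typescript
// Hypothetical options shape; `apiUrl` is assumed to be the option
// the loader uses to override the default hosted endpoint.
interface LocalUnstructuredOptions {
  apiUrl: string;
}

// Builds options for a locally running Unstructured API container.
// The endpoint path is an assumption; verify it against your container's docs.
function localUnstructuredOptions(
  port: number = 8000,
  host: string = "localhost"
): LocalUnstructuredOptions {
  return { apiUrl: `http://${host}:${port}/general/v0/general` };
}

// Matches the port published by the `docker run -p 8000:8000` command above.
const localOptions = localUnstructuredOptions();
console.log(localOptions.apiUrl);
```

Under these assumptions, the resulting object would be passed as the second constructor argument, for example `new UnstructuredLoader(filePath, localOptions)`.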

### Credentials

Head to [unstructured.io](https://unstructured.io/api-key-hosted) to sign up for Unstructured and generate an API key. Once you've done this, set the `UNSTRUCTURED_API_KEY` environment variable:

```bash
export UNSTRUCTURED_API_KEY="your-api-key"
```
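A script can fail fast when the variable is missing rather than surfacing an authentication error later. The sketch below assumes nothing about how the loader itself reads the key; `requireApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the example runs without Node.js typings.

```typescript
// A plain map of environment variables; in a Node.js script this would be `process.env`.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the key or throws a descriptive error.
function requireApiKey(env: Env): string {
  const key = env["UNSTRUCTURED_API_KEY"];
  if (key === undefined || key.trim() === "") {
    throw new Error("UNSTRUCTURED_API_KEY is not set");
  }
  return key;
}

// A literal object stands in for the real environment here.
const apiKey = requireApiKey({ UNSTRUCTURED_API_KEY: "your-api-key" });
console.log(`API key loaded (${apiKey.length} characters)`);
```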

### Installation

The LangChain `UnstructuredLoader` integration lives in the `@langchain/community` package:

```{=mdx}
import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
import Npm2Yarn from "@theme/Npm2Yarn";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

<Npm2Yarn>
  @langchain/community @langchain/core
</Npm2Yarn>

```

## Instantiation

Now we can instantiate the loader and load documents:

```typescript
import { UnstructuredLoader } from "@langchain/community/document_loaders/fs/unstructured";

const loader = new UnstructuredLoader("../../../../../../examples/src/document_loaders/example_data/notion.md");
```

## Load

```typescript
const docs = await loader.load();
docs[0];
```

```text
Document {
  pageContent: '# Testing the notion markdownloader',
  metadata: {
    filename: 'notion.md',
    languages: [ 'eng' ],
    filetype: 'text/plain',
    category: 'NarrativeText'
  },
  id: undefined
}
```


```typescript
console.log(docs[0].metadata);
```

```text
{
  filename: 'notion.md',
  languages: [ 'eng' ],
  filetype: 'text/plain',
  category: 'NarrativeText'
}
```
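Each returned document carries a `category` value in its metadata, as in the output above, so a quick tally of categories can show what kinds of elements a file was split into. This is a minimal sketch: `DocLike` is a local stand-in type rather than the library's `Document` class, `countByCategory` is a hypothetical helper, and every sample entry other than the first is invented for illustration.

```typescript
// Minimal stand-in for the shape of a loaded document (not the library's class).
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Counts documents per `metadata.category`, using "Unknown" when it is absent.
function countByCategory(docs: DocLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const doc of docs) {
    const raw = doc.metadata["category"];
    const category = typeof raw === "string" ? raw : "Unknown";
    counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}

// Sample data: the first entry mirrors the output above, the rest are hypothetical.
const sampleDocs: DocLike[] = [
  { pageContent: "# Testing the notion markdownloader", metadata: { category: "NarrativeText" } },
  { pageContent: "A hypothetical heading", metadata: { category: "Title" } },
  { pageContent: "A hypothetical paragraph", metadata: { category: "NarrativeText" } },
  { pageContent: "An element without a category", metadata: {} },
];

const categoryCounts = countByCategory(sampleDocs);
console.log(categoryCounts);
```

With real data, the same function could be called as `countByCategory(docs)` on the array returned by `loader.load()`.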


## Directories

You can also load all of the files in a directory using [`UnstructuredDirectoryLoader`](https://api.js.langchain.com/classes/langchain.document_loaders_fs_unstructured.UnstructuredDirectoryLoader.html), which inherits from [`DirectoryLoader`](/docs/integrations/document_loaders/file_loaders/directory):


```typescript
import { UnstructuredDirectoryLoader } from "@langchain/community/document_loaders/fs/unstructured";

const directoryLoader = new UnstructuredDirectoryLoader(
  "../../../../../../examples/src/document_loaders/example_data/",
  {}
);
const directoryDocs = await directoryLoader.load();
console.log("directoryDocs.length: ", directoryDocs.length);
console.log(directoryDocs[0]);
```

```text
Unknown file type: Star_Wars_The_Clone_Wars_S06E07_Crisis_at_the_Heart.srt
Unknown file type: test.mp3

directoryDocs.length:  247
Document {
  pageContent: 'Bitcoin: A Peer-to-Peer Electronic Cash System',
  metadata: {
    filetype: 'application/pdf',
    languages: [ 'eng' ],
    page_number: 1,
    filename: 'bitcoin.pdf',
    category: 'Title'
  },
  id: undefined
}
```
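Because a directory load returns one flat array of elements from many files, it can help to group them back by source file using the `filename` metadata field shown in the output above. The following sketch assumes only that field; `LoadedElement` is a local stand-in type, `groupByFilename` is a hypothetical helper, and the sample entries other than the Bitcoin title are invented for illustration.

```typescript
// Minimal stand-in for the shape of a loaded element (not the library's class).
interface LoadedElement {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Groups element text by `metadata.filename`, preserving the original order.
function groupByFilename(elements: LoadedElement[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const element of elements) {
    const raw = element.metadata["filename"];
    const filename = typeof raw === "string" ? raw : "(unknown file)";
    const bucket = groups.get(filename) ?? [];
    bucket.push(element.pageContent);
    groups.set(filename, bucket);
  }
  return groups;
}

// Sample data: the first entry mirrors the output above, the rest are hypothetical.
const sampleElements: LoadedElement[] = [
  { pageContent: "Bitcoin: A Peer-to-Peer Electronic Cash System", metadata: { filename: "bitcoin.pdf" } },
  { pageContent: "A hypothetical second element", metadata: { filename: "bitcoin.pdf" } },
  { pageContent: "# Testing the notion markdownloader", metadata: { filename: "notion.md" } },
];

const elementsByFile = groupByFilename(sampleElements);
for (const [filename, contents] of elementsByFile) {
  console.log(`${filename}: ${contents.length} element(s)`);
}
```

With real data, the same function could be called as `groupByFilename(directoryDocs)`.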


## API reference

For detailed documentation of all `UnstructuredLoader` features and configurations, head to the API reference: https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_unstructured.UnstructuredLoader.html