# LarkSuite (FeiShu)

>[LarkSuite](https://www.larksuite.com/) is an enterprise collaboration platform developed by ByteDance.

This notebook covers how to load data from the `LarkSuite` REST API into a format that can be ingested into LangChain, along with example usage for text summarization.

The LarkSuite API requires an access token (`tenant_access_token` or `user_access_token`). See the [LarkSuite open platform documentation](https://open.larksuite.com/document) for API details.
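
As a rough illustration of where a `tenant_access_token` might come from, the sketch below builds the token request with the standard library only. The endpoint path, the `app_id`/`app_secret` payload fields, and the `code`/`tenant_access_token` response fields are assumptions based on the open platform's custom-app flow, and the helper names are hypothetical. Verify all of them against the official documentation before relying on this.

```python
import json
import urllib.request

# Assumed endpoint for custom (internal) apps; confirm in the open platform docs.
TOKEN_URL = (
    "https://open.larksuite.com/open-apis/auth/v3/tenant_access_token/internal"
)


def build_token_request(app_id: str, app_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a tenant access token."""
    payload = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=payload,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def fetch_tenant_access_token(app_id: str, app_secret: str, timeout: float = 10.0) -> str:
    """Send the request and return the token (response field names are assumed)."""
    request = build_token_request(app_id, app_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    if body.get("code") != 0:
        raise RuntimeError(f"Token request failed: {body}")
    return body["tenant_access_token"]
```

Separating request construction from the network call keeps the payload easy to inspect without credentials; only `fetch_tenant_access_token` touches the network.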

In [5]:
from getpass import getpass
from urllib.parse import urlparse

from langchain_community.document_loaders.larksuite import (
    LarkSuiteDocLoader,
    LarkSuiteWikiLoader,
)


# Parse a LarkSuite (Feishu) document URL into its domain and document ID
def parse_lark_url(url):
    parsed_url = urlparse(url)
    # Drop empty segments so a trailing slash does not yield an empty ID
    path_segments = [segment for segment in parsed_url.path.split('/') if segment]

    domain = parsed_url.netloc
    # The document ID is the last path segment, e.g. /docx/<document_id>
    document_id = path_segments[-1] if path_segments else None

    return {
        'domain': f'https://{domain}',
        'document_id': document_id
    }

# Example usage
lark_url = "https://nd9fgiy0w0.feishu.cn/docx/JfjjdJgTuoc484x8HFGcdzAinNe"
lark_config = parse_lark_url(lark_url)

print("Parsed Lark Suite configuration:")
print(f"Domain: {lark_config['domain']}")
print(f"Document ID: {lark_config['document_id']}")

# Update the variables with parsed values
DOMAIN = lark_config['domain']
DOCUMENT_ID = lark_config['document_id']

# DOMAIN = input("larksuite domain")
ACCESS_TOKEN = getpass("larksuite tenant_access_token or user_access_token")
# DOCUMENT_ID = input("larksuite document id")

Parsed Lark Suite configuration:
Domain: https://nd9fgiy0w0.feishu.cn
Document ID: JfjjdJgTuoc484x8HFGcdzAinNe
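
The example URL above also carries a type segment (`docx`) just before the document ID. A minimal sketch of a variant of `parse_lark_url` that returns that segment as well is shown here; the function name `parse_lark_url_with_kind` and the `/wiki/` URL used in the usage lines are hypothetical, and whether the segment maps cleanly onto a particular loader is an assumption to check against your own URLs.

```python
from urllib.parse import urlparse


def parse_lark_url_with_kind(url: str) -> dict:
    """Split a document URL into domain, type segment, and document ID."""
    parsed = urlparse(url)
    # Ignore empty segments produced by leading or trailing slashes.
    segments = [segment for segment in parsed.path.split("/") if segment]
    return {
        "domain": f"{parsed.scheme}://{parsed.netloc}",
        # Second-to-last segment, e.g. "docx" in /docx/<document_id>.
        "kind": segments[-2] if len(segments) >= 2 else None,
        "document_id": segments[-1] if segments else None,
    }


doc_info = parse_lark_url_with_kind(
    "https://nd9fgiy0w0.feishu.cn/docx/JfjjdJgTuoc484x8HFGcdzAinNe"
)
# Hypothetical wiki-style URL, used only to show trailing-slash handling.
wiki_info = parse_lark_url_with_kind("https://example.feishu.cn/wiki/ABC123/")
```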


## Load From Document

In [8]:
from pprint import pprint

larksuite_loader = LarkSuiteDocLoader(DOMAIN, ACCESS_TOKEN, DOCUMENT_ID)
docs = larksuite_loader.load()

# print(docs)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
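
The error message above is what Python's `json` module reports when the text it is asked to parse does not start with a valid JSON value, for example an empty body or an HTML page. One plausible cause (an assumption, not confirmed by this run) is that the request reached a host or path that did not answer with JSON, or that the token was rejected with a non-JSON response. The sketch below only reproduces the message itself; `try_parse` is a hypothetical helper.

```python
import json


def try_parse(body: str):
    """Return (parsed_value, None) on success or (None, error_message) on failure."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


# An empty body and an HTML body both fail at the very first character.
empty_result = try_parse("")
html_result = try_parse("<html><body>Not Found</body></html>")
json_result = try_parse('{"code": 0}')
```

Checking the raw response text and status code before parsing is usually the quickest way to tell these cases apart.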

## Load From Wiki

In [None]:
from pprint import pprint

DOCUMENT_ID = input("larksuite wiki id")
larksuite_loader = LarkSuiteWikiLoader(DOMAIN, ACCESS_TOKEN, DOCUMENT_ID)
docs = larksuite_loader.load()

pprint(docs)

[Document(page_content='Test doc\nThis is a test wiki doc.\n', metadata={'document_id': 'TxOKdtMWaoSTDLxYS4ZcdEI7nwc', 'revision_id': 15, 'title': 'Test doc'})]


In [None]:
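# See https://python.langchain.com/docs/use_cases/summarization for more details
from langchain.chains.summarize import load_summarize_chain
from langchain_community.llms.fake import FakeListLLM

# FakeListLLM needs a list of canned responses; this one is a placeholder.
# Swap in a real LLM to get an actual summary.
llm = FakeListLLM(responses=["This is a placeholder summary."])
chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)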
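
To make the `map_reduce` idea concrete, here is a minimal, library-free sketch of the pattern: summarize each text on its own (map), then summarize the combined partial summaries (reduce). It is not LangChain's implementation. `map_reduce_summarize` and `first_sentence` are hypothetical names, and `first_sentence` is a toy stand-in for an LLM call.

```python
from typing import Callable, List


def map_reduce_summarize(texts: List[str], summarize: Callable[[str], str]) -> str:
    """Summarize each text separately, then summarize the joined partials."""
    partial_summaries = [summarize(text) for text in texts]  # map step
    return summarize("\n".join(partial_summaries))  # reduce step


def first_sentence(text: str) -> str:
    """Toy 'summarizer': keep only the text up to the first period."""
    return text.strip().split(".")[0].strip() + "."


summary = map_reduce_summarize(
    ["Test doc. This is a test wiki doc.", "Second doc. More text."],
    first_sentence,
)
```

The appeal of the pattern is that each map call sees only one document, so long inputs can be handled piece by piece before the final combining step.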