# Installing the **Undatasio** Python API library

In [1]:
# install undatasio
!pip install -U undatasio
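
To confirm that the install succeeded before going further, you can query the installed version with the standard library. This is a minimal sketch: the helper name `installed_version` is made up for this example, and it assumes the distribution is registered under the same name used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("undatasio"))
```

If this prints `None`, the package was most likely installed into a different Python environment than the one running the notebook kernel.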



## To import the **UnDatasIO** class and create an object, you need a token and, optionally, a task name from the Undatasio platform.

In [2]:
from undatasio.undatasio import UnDatasIO

undatasio_obj = UnDatasIO('undatasio token ...')
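
Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. One common alternative is to read it from an environment variable. The sketch below uses only the standard library. The variable name `UNDATASIO_TOKEN` and the helper `load_token` are assumptions made for this example and are not part of the Undatasio library.

```python
import os


def load_token(env_var="UNDATASIO_TOKEN"):
    """Read the API token from an environment variable and fail early if it is unset.

    The variable name is a hypothetical convention, not something Undatasio defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Undatasio token."
        )
    return token


# Usage, assuming the variable has been exported before starting the notebook:
# undatasio_obj = UnDatasIO(load_token())
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by an API call.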

## The **show_version** method of the Undatasio object returns all version information and file lists for the task name associated with the current token.

In [None]:
version_data = undatasio_obj.show_version()
version_data

## The **get_result_to_langchain_document** method of the Undatasio object returns a LangChain Document object. The values for its parameters (**type_info**, **file_name**, **version**) can be taken from the data returned by the **show_version** method.

In [4]:
lc_document = undatasio_obj.get_result_to_langchain_document(
    type_info=['text'],
    file_name='1d8c9bc374114b6e901da.pdf',
    version='v26'
)
lc_document



## Use **RecursiveCharacterTextSplitter** from **langchain_text_splitters** to split the text returned by the **get_result_to_langchain_document** function.
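
To build intuition for the `chunk_size` and `chunk_overlap` parameters, here is a deliberately simplified fixed-window sketch in plain Python. It is not the algorithm `RecursiveCharacterTextSplitter` uses, since that class splits on separators recursively and tries to keep chunks under the size limit. The sketch only shows how consecutive chunks can share characters. The function name `fixed_window_chunks` is made up for this example.

```python
def fixed_window_chunks(text, chunk_size=100, chunk_overlap=20):
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(250))  # 250 characters of dummy text
chunks = fixed_window_chunks(sample)
print([len(c) for c in chunks])            # [100, 100, 90]
print(chunks[0][-20:] == chunks[1][:20])   # True: 20 shared characters
```

With the real splitter, chunk boundaries usually fall on separators such as newlines or spaces, so chunks are often shorter than `chunk_size` and the overlap is not always exactly `chunk_overlap` characters.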

In [5]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)
texts = text_splitter.split_documents([lc_document])
texts

[Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='2. Profit-takingxpullback has developed into a minor correction. While the10\\%'),
 Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='correction Since mid-May is broadly in-line with the historical norms of most technical'),
 Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='bull runs, the six-week market weakness has prompted increasing investor questions'),
 Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='about the strengthof thepolicy put, and concerns regarding a redux of the powerful but'),
 Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='short-livedRe0peningrallyinlate2022/early2023.Empirically,inthe23episodesinthe'),
 Document(metadata={'source': '_v26_1d8c9bc374114b6e901da.pdf_[text]'}, page_content='past 20 years where MSCl China rallied more than20\\