# Split by Number of Characters
[The semantic-text-splitter package page](https://pypi.org/project/semantic-text-splitter/) provides useful background on how this works.
This is the simplest method. It splits based on characters (by default "\n\n") and measures chunk length by number of characters.
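
As a rough illustration of the idea, and not the library's actual implementation, the sketch below splits on a separator and greedily packs the pieces into chunks whose length is measured in characters. The function name `split_by_characters` and its signature are hypothetical, chosen only for this example.

```python
def split_by_characters(text, max_characters, separator="\n\n"):
    """Greedy sketch: split on `separator`, then pack pieces into chunks.

    Assumption for this example: a single piece longer than
    `max_characters` is kept whole rather than split further.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        # Re-join with the separator so packed pieces keep their original spacing.
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= max_characters:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccccccc"
chunks = split_by_characters(sample, max_characters=10)
print(chunks)
```

Here the first two pieces fit together within the 10-character limit, while the third starts a new chunk. A real splitter would also fall back to smaller boundaries when a single piece exceeds the limit.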

Large language models (LLMs) can be used for many tasks, but they often have a limited context size that can be smaller than the documents you want to use. To work with longer documents, you often have to split the text into chunks that fit within this context size.

This crate provides methods for splitting longer pieces of text into smaller chunks, aiming to maximize a desired chunk size, but still splitting at semantically sensible boundaries whenever possible.

1. max_characters: Maximum number of characters in a chunk.
2. trim_chunks: Whether the splitter trims whitespace from each chunk for you. It can be set so that the splitter does not trim whitespace.
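
To make the effect of trimming concrete, here is a minimal sketch in plain Python. The helper `finalize_chunk` is hypothetical and only mimics what a trim option is assumed to do; it is not part of any library.

```python
def finalize_chunk(chunk: str, trim_chunks: bool = True) -> str:
    """Hypothetical helper: strip surrounding whitespace only when asked to."""
    return chunk.strip() if trim_chunks else chunk


# A chunk that ends with a space and a paragraph break.
raw = "This year we are finally together again. \n\n"

trimmed = finalize_chunk(raw, trim_chunks=True)
untrimmed = finalize_chunk(raw, trim_chunks=False)

# Trimming changes the character count, which matters when chunk
# length is measured in characters.
print(len(untrimmed), len(trimmed))
```

With trimming enabled, the trailing whitespace no longer counts toward the chunk's character length; with it disabled, the chunk is returned exactly as it was cut.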

In [None]:
%pip install -qU langchain-text-splitters

In [14]:
# This is a long document we can split up.
with open("../../state_of_the_union.txt") as f:
    state_of_the_union = f.read()

In [15]:
from langchain_text_splitters import SemanticCharacterTextSplitter

# Cap each chunk at 200 characters and trim surrounding whitespace.
text_splitter = SemanticCharacterTextSplitter(
    max_characters=200,
    trim_chunks=True,
)

In [16]:
texts = text_splitter.create_documents([state_of_the_union])
print(texts[0])
print(texts[1])

page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.'
page_content='Last year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans.'


In [17]:
text_splitter.split_text(state_of_the_union)[:2]

['Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.',
 'Last year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans.']