
#**Text Splitters**
Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example is splitting a long document into smaller chunks that fit into your model's context window. LangChain has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents.

When you want to deal with long pieces of text, it is necessary to split up that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep the semantically related pieces of text together. What "semantically related" means could depend on the type of text. This notebook showcases several ways to do that.

At a high level, text splitters work as follows:

1. Split the text up into small, semantically meaningful chunks (often sentences).
2. Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
3. Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text with some overlap (to keep context between chunks).

That means there are two different axes along which you can customize your text splitter:

* How the text is split
* How the chunk size is measured
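
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not LangChain's implementation: the regex sentence splitter and the helper names `split_into_sentences` and `merge_with_overlap` are hypothetical, and size is measured in characters with `len`.

```python
import re


def split_into_sentences(text):
    """Step 1: break the text into small pieces (here, naive regex sentences)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def merge_with_overlap(pieces, chunk_size, chunk_overlap, length_function=len):
    """Steps 2 and 3: pack pieces into chunks and carry some overlap forward."""
    chunks, current = [], []
    for piece in pieces:
        if current and length_function(" ".join(current + [piece])) > chunk_size:
            chunks.append(" ".join(current))
            # Drop leading pieces until what remains fits the overlap budget
            # and leaves room for the incoming piece.
            while current and (
                length_function(" ".join(current)) > chunk_overlap
                or length_function(" ".join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Cats sleep a lot. Dogs like to play. Birds can fly. Fish swim."
chunks = merge_with_overlap(split_into_sentences(sample), chunk_size=40, chunk_overlap=20)
for chunk in chunks:
    print(repr(chunk))
```

With a 40-character limit and a 20-character overlap budget, each chunk starts with the last sentence of the previous chunk. Swapping `length_function` for a token counter changes the second axis (how size is measured) without touching the first (how the text is split).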


The input text is split based on a defined chunk size with some defined chunk overlap. The chunk size is the maximum size of a chunk, as measured by a length function. That function usually counts characters or tokens.


Chunk Size and Chunk Overlap in Document Splitting

Chunk overlap is the amount of text shared between two consecutive chunks. A small overlap preserves some continuity of context from one chunk to the next. LangChain provides several types of splitters.

#**Types of Text Splitters**
LangChain offers many different types of text splitters. These all live in the langchain-text-splitters package.

#**Recursive**:
Recursively splits text on a list of user-defined characters, with the aim of keeping related pieces of text next to each other. This is the recommended way to start splitting text.

#**Character**
Splits text on a single user-defined character. One of the simpler methods.

Splitting by character is the simplest method. It splits on a single separator (by default "\n\n") and measures chunk length by number of characters.

* How the text is split: by a single character separator.
* How the chunk size is measured: by number of characters.

#**HTML**
Splits text based on HTML-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the HTML).

#**Markdown**
Splits text based on Markdown-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the Markdown).

#**Code**
Splits text based on characters specific to coding languages (Python, JS, and others). The Language enum printed in the Split code section lists 23 supported languages in the version used in this notebook.

#**Token**
Splits text on tokens. There exist a few different ways to measure tokens.


#**Character Text Splitter**

In [29]:
%pip install -qU langchain-text-splitters

In [30]:
# This is a long document we can split up.
with open("/content/text.txt") as f:
    state_of_the_union = f.read()

In [31]:
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

In [32]:
texts = text_splitter.create_documents([state_of_the_union])
print(texts[0])

page_content='it is butterfly.\nthis is a lion.\ni love this.'


In [33]:
metadatas = [{"document": 1}, {"document": 2}]
documents = text_splitter.create_documents(
    [state_of_the_union, state_of_the_union], metadatas=metadatas
)
print(documents[0])

page_content='it is butterfly.\nthis is a lion.\ni love this.' metadata={'document': 1}


In [34]:
text_splitter.split_text(state_of_the_union)[0]

'it is butterfly.\nthis is a lion.\ni love this.'

In [35]:
text1 = """LangChain is a framework for developing applications powered by language models. \n
It enables applications that:\n
Are context-aware: connect a language model to sources of context (prompt instructions,\n
few shot examples, content to ground its response in, etc.)\n
Reason: rely on a language model to reason (about how to answer based on provided context,\n
what actions to take, etc.)\n
This framework consists of several parts.\n

LangChain Libraries: The Python and JavaScript libraries. Contains interfaces and integrations \n for a myriad of components,a basic run time for combining these \n
components into chains and agents, and off-the-shelf implementations of chains and agents. \n
LangChain Templates: A collection of easily deployable reference architectures for a wide variety of tasks.\n
LangServe: A library for deploying LangChain chains as a REST API.\n
LangSmith: A developer platform that lets you debug, test, evaluate,\n
and monitor chains built on any LLM framework and seamlessly integrates with LangChain.\n"""

In [36]:
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=107,
    chunk_overlap=2,
    length_function=len,
    is_separator_regex=False,
)

In [37]:
text_splitter.split_text(text1)

['LangChain is a framework for developing applications powered by language models.',
 'It enables applications that:',
 'Are context-aware: connect a language model to sources of context (prompt instructions,',
 'few shot examples, content to ground its response in, etc.)',
 'Reason: rely on a language model to reason (about how to answer based on provided context,',
 'what actions to take, etc.)\nThis framework consists of several parts.',
 'LangChain Libraries: The Python and JavaScript libraries. Contains interfaces and integrations',
 'for a myriad of components,a basic run time for combining these',
 'components into chains and agents, and off-the-shelf implementations of chains and agents.',
 'LangChain Templates: A collection of easily deployable reference architectures for a wide variety of tasks.',
 'LangServe: A library for deploying LangChain chains as a REST API.',
 'LangSmith: A developer platform that lets you debug, test, evaluate,',
 'and monitor chains built on any LLM

In [38]:
texts = text_splitter.create_documents([text1])
print(texts[0])

page_content='LangChain is a framework for developing applications powered by language models.'


In [39]:
metadatas = [{"Langchain": 1}, {"Langchain": 2}]
documents = text_splitter.create_documents(
    [text1, text1], metadatas=metadatas
)
print(documents[0])

page_content='LangChain is a framework for developing applications powered by language models.' metadata={'Langchain': 1}


#**Recursively split by character**
This text splitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough.

The default list is ["\n\n", "\n", " ", ""]. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.

* How the text is split: by list of characters.
* How the chunk size is measured: by number of characters.
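
To make the "try each separator in order" idea concrete, here is a simplified sketch in plain Python. It is an assumption-laden illustration rather than the library's code: the function name `recursive_split` is hypothetical, there is no chunk overlap, and pieces are only merged with neighbours produced by the same separator.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator first; recurse with finer ones on oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue
        candidate = current + separator + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too large: retry it with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph here.\n\nSecond paragraph is a little bit longer.\nIt has two lines."
pieces = recursive_split(sample, chunk_size=30)
for piece in pieces:
    print(repr(piece))
```

The first paragraph fits and stays whole. The second is too long, so it is split on "\n", and its first line is split again on spaces.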


In [41]:
# This is a long document we can split up.
with open("/content/text.txt") as f:
    state = f.read()

In [42]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [43]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=100,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)

In [44]:
texts = text_splitter.create_documents([state])
print(texts[0])

page_content='it is butterfly.\nthis is a lion.\ni love this.'


In [45]:
text2 = """ Agents dynamically call tools. The results of those tool calls are added back to the prompt,\n
so that the agent can plan the next action. Depending on what tools are being used and how they’re being called,\n
the agent prompt can easily grow larger than the model context window.\n
With LCEL, it’s easy to add custom functionality for managing the size of prompts within your chain or agent.\n
Let’s look at simple agent example that can search Wikipedia for information.\n
Most LLM applications have a conversational interface. An essential component of a conversation is being able \n
to refer to information introduced earlier in the conversation. At bare minimum, a conversational system should\n
be able to access some window of past messages directly. A more complex system will need to have a world model \n
that it is constantly updating, which allows it to do things like maintain information about entities and their relationships.\n
We call this ability to store information about past interactions "memory".\n
LangChain provides a lot of utilities for adding memory to a system.\n
These utilities can be used by themselves or incorporated seamlessly into a chain.\n
"""



In [46]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=100,
    chunk_overlap=10,
    length_function=len,
    is_separator_regex=False,
)

In [47]:
text_splitter.split_text(text2)

['Agents dynamically call tools. The results of those tool calls are added back to the prompt,',
 'so that the agent can plan the next action. Depending on what tools are being used and how they’re',
 'they’re being called,',
 'the agent prompt can easily grow larger than the model context window.',
 'With LCEL, it’s easy to add custom functionality for managing the size of prompts within your chain',
 'chain or agent.',
 'Let’s look at simple agent example that can search Wikipedia for information.',
 'Most LLM applications have a conversational interface. An essential component of a conversation is',
 'is being able',
 'to refer to information introduced earlier in the conversation. At bare minimum, a conversational',
 'system should',
 'be able to access some window of past messages directly. A more complex system will need to have a',
 'to have a world model',
 'that it is constantly updating, which allows it to do things like maintain information about',
 'about entities and their

In [48]:
metadatas = [{"Agent": 1}, {"Memory": 2}]
documents = text_splitter.create_documents(
    [text2, text2], metadatas=metadatas
)
print(documents[0])

page_content='Agents dynamically call tools. The results of those tool calls are added back to the prompt,' metadata={'Agent': 1}


In [49]:
text_splitter.split_text(text2)[0]

'Agents dynamically call tools. The results of those tool calls are added back to the prompt,'

## **Token splitting**

We can also split on token count explicitly.

This can be useful because LLM context windows are usually specified in tokens.

As a rule of thumb, one token is about 4 characters of English text.
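
The 4-characters-per-token rule of thumb gives a quick size estimate when no tokenizer is at hand. The sketch below only encodes that heuristic; the helper names `estimate_tokens` and `chars_for_token_budget` are hypothetical, and real counts depend on the tokenizer and the text.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb only, not an exact figure


def estimate_tokens(text):
    """Rough token estimate from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def chars_for_token_budget(max_tokens):
    """Approximate character budget that corresponds to a token budget."""
    return max_tokens * CHARS_PER_TOKEN


estimate = estimate_tokens("foo bar bazzyfoo")
print(estimate)
print(chars_for_token_budget(256))
```

For comparison, the tiktoken-based splitter later in this notebook breaks the same 16-character string into 6 tokens, so treat the estimate as rough and count with the real tokenizer when the limit matters.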

#**Split by tokens**
Language models have a token limit that you should not exceed. When you split your text into chunks, it is therefore a good idea to count the number of tokens. There are many tokenizers. When you count tokens in your text, use the same tokenizer as the language model.

* tiktoken
* spaCy
* NLTK
* Hugging Face tokenizer



#tiktoken
tiktoken is a fast BPE tokenizer created by OpenAI.

We can use it to estimate tokens used. It will probably be more accurate for the OpenAI models.



* How the text is split: by character passed in.
* How the chunk size is measured: by tiktoken tokenizer.



In [51]:
!pip install langchain


In [52]:
from langchain_text_splitters import TokenTextSplitter

In [54]:
%pip install --upgrade --quiet langchain-text-splitters tiktoken

!pip install pypdf



In [55]:
from langchain_community.document_loaders import PyPDFLoader

In [57]:
loader = PyPDFLoader("/content/MachineLearning-Lecture01.pdf")
pages = loader.load()

In [58]:
text_splitter = TokenTextSplitter(chunk_size=1, chunk_overlap=0)

In [59]:
text1 = "foo bar bazzyfoo"

In [60]:
text_splitter.split_text(text1)

['foo', ' bar', ' b', 'az', 'zy', 'foo']
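
The output above shows one token per chunk. The mechanism can be sketched without tiktoken: encode the text to tokens, cut the token sequence into windows, and join each window back into text. The tokenizer here is a hypothetical stand-in (`toy_encode` treats each word with its leading whitespace as one token), so it does not split words into subword pieces the way a BPE tokenizer does.

```python
import re


def toy_encode(text):
    """Hypothetical stand-in tokenizer: one token per word, keeping its leading whitespace."""
    return re.findall(r"\s*\S+", text)


def split_on_tokens(text, chunk_size, chunk_overlap=0):
    """Cut the token sequence into windows of chunk_size tokens and join each back into text."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    tokens = toy_encode(text)
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append("".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


one_per_chunk = split_on_tokens("foo bar bazzyfoo", chunk_size=1)
overlapping = split_on_tokens("a b c d e", chunk_size=3, chunk_overlap=1)
print(one_per_chunk)
print(overlapping)
```

Because chunks are cut on token boundaries, a chunk can begin or end in the middle of a sentence, which is also visible in the PDF output further down.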

In [61]:
text_splitter = TokenTextSplitter(chunk_size=10, chunk_overlap=0)

In [62]:
docs = text_splitter.split_documents(pages)

In [63]:
for doc in docs:
    print(doc)

page_content='MachineLearning-Lecture01  \n' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content='Instructor (Andrew Ng):  Okay. Good' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content=' morning. Welcome to CS229, the machine ' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content='\nlearning class. So what I wanna do today' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content=' is ju st spend a little time going over the' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content=' logistics \nof the class, and then we' metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content="'ll start to  talk a bit about machine learning" metadata={'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}
page_content='.  \nBy way of introduction, my' metadata={'source': '/content/MachineLearning-Lecture01

In [64]:
pages[0].metadata

{'source': '/content/MachineLearning-Lecture01.pdf', 'page': 0}

In [65]:
text_splitter = TokenTextSplitter(chunk_size=10, chunk_overlap=0)

texts = text_splitter.split_text(text2)
print(texts[0])

 Agents dynamically call tools. The results of those tool


#**SpaCy**
spaCy is an open-source software library for advanced natural language processing, written in Python and Cython.

The spaCy tokenizer is an alternative to NLTK for splitting text.

* How the text is split: by spaCy tokenizer.
* How the chunk size is measured: by number of characters.



In [66]:
%pip install --upgrade --quiet  spacy

In [67]:
# This is a long document we can split up.
with open("/content/stories.txt") as f:
    state_of_the_union = f.read()

In [68]:
from langchain_text_splitters import SpacyTextSplitter

In [69]:
text_splitter = SpacyTextSplitter(chunk_size=1000)

texts = text_splitter.split_text(state_of_the_union)
print(texts[0])


.

“The Happy Prince” by Oscar Wilde
English short stories“The Happy Prince” is a story that explores compassion in society, serving as a fairy tale and a fable at once.

It’s about a prince who is only allowed to see beauty and comfort in his life.

When he dies, he’s turned into a golden statue in his city, where he discovers that others actually live their lives in poverty and darkness.

With the help of a swallow (a type of bird), the prince manages to help people even after death.

Since the story is old, much of the English is outdated (not used in modern English).

Still, if you have a good grasp of the English language, you can use this story to give yourself a great reading challenge.

14.

“The Night Train at Deoli” by Ruskin Bond
The Night Train at DeoliRuskin Bond used to spend summers at his grandmother’s house in Dehradun, India.

While taking the train, he always had to pass through a small station called Deoli.




# **NLTK**
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English, written in the Python programming language.

Rather than just splitting on "\n\n", we can use NLTK to split based on NLTK tokenizers.

* How the text is split: by NLTK tokenizer.
* How the chunk size is measured: by number of characters.


In [70]:
!pip install nltk



In [71]:
import nltk
nltk.download('punkt')


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

In [72]:
# This is a long document we can split up.
with open("/content/stories.txt") as f:
    state = f.read()


In [73]:
from langchain_text_splitters import NLTKTextSplitter

text_splitter = NLTKTextSplitter(chunk_size=1000)

In [74]:
texts = text_splitter.split_text(state)
print(texts[0])


.

“The Happy Prince” by Oscar Wilde
English short stories“The Happy Prince” is a story that explores compassion in society, serving as a fairy tale and a fable at once.

It’s about a prince who is only allowed to see beauty and comfort in his life.

When he dies, he’s turned into a golden statue in his city, where he discovers that others actually live their lives in poverty and darkness.

With the help of a swallow (a type of bird), the prince manages to help people even after death.Since the story is old, much of the English is outdated (not used in modern English).

Still, if you have a good grasp of the English language, you can use this story to give yourself a great reading challenge.

14.

“The Night Train at Deoli” by Ruskin Bond
The Night Train at DeoliRuskin Bond used to spend summers at his grandmother’s house in Dehradun, India.

While taking the train, he always had to pass through a small station called Deoli.


#**Hugging Face tokenizer**
Hugging Face has many tokenizers.

Here we use the Hugging Face GPT2TokenizerFast tokenizer to count the text length in tokens.

* How the text is split: by character passed in.
* How the chunk size is measured: by number of tokens calculated by the Hugging Face tokenizer.


In [75]:
from transformers import GPT2TokenizerFast

In [76]:
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")


In [77]:
# This is a long document we can split up.
with open("/content/stories.txt") as f:
    state = f.read()

In [78]:
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(
    tokenizer, chunk_size=200, chunk_overlap=0
)


In [79]:
texts = text_splitter.split_text(state)

print(texts[0])

. “The Happy Prince” by Oscar Wilde
English short stories“The Happy Prince” is a story that explores compassion in society, serving as a fairy tale and a fable at once. It’s about a prince who is only allowed to see beauty and comfort in his life. When he dies, he’s turned into a golden statue in his city, where he discovers that others actually live their lives in poverty and darkness. With the help of a swallow (a type of bird), the prince manages to help people even after death.Since the story is old, much of the English is outdated (not used in modern English). Still, if you have a good grasp of the English language, you can use this story to give yourself a great reading challenge.


#**SentenceTransformers**
The SentenceTransformersTokenTextSplitter is a specialized text splitter for use with the sentence-transformer models. The default behaviour is to split the text into chunks that fit the token window of the sentence transformer model that you would like to use.



In [80]:
!pip install sentence-transformers



In [81]:
from langchain_text_splitters import SentenceTransformersTokenTextSplitter

In [82]:
splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0)
text = "Lorem the text "

In [83]:
count_start_and_stop_tokens = 2
text_token_count = splitter.count_tokens(text=text) - count_start_and_stop_tokens
print(text_token_count)

4


In [84]:
token_multiplier = splitter.maximum_tokens_per_chunk // text_token_count + 1

# `text_to_split` does not fit in a single chunk
text_to_split = text * token_multiplier

print(f"tokens in text to split: {splitter.count_tokens(text=text_to_split)}")

tokens in text to split: 390


In [85]:
text_chunks = splitter.split_text(text=text_to_split)

print(text_chunks[1])

lorem the text
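
The numbers printed above can be checked with a little arithmetic. This sketch assumes the default model's window is 384 tokens; that value is not printed in this notebook, but it is consistent with the 390 tokens reported above. It also assumes each chunk holds up to that many content tokens.

```python
import math

max_tokens_per_chunk = 384   # assumption: window of the default model (not shown in this notebook)
text_token_count = 4         # tokens in "Lorem the text " without start/stop tokens
start_and_stop_tokens = 2

# Repeat the text just often enough that it no longer fits in one chunk.
token_multiplier = max_tokens_per_chunk // text_token_count + 1
content_tokens = token_multiplier * text_token_count
total_tokens = content_tokens + start_and_stop_tokens

chunks_needed = math.ceil(content_tokens / max_tokens_per_chunk)
leftover_tokens = content_tokens - max_tokens_per_chunk

print(token_multiplier, content_tokens, total_tokens)
print(chunks_needed, leftover_tokens)
```

Under these assumptions the text is repeated 97 times, and the second chunk holds the 4 leftover tokens. That is consistent with the single "lorem the text" printed for `text_chunks[1]`.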


#**Split code**
RecursiveCharacterTextSplitter.from_language lets you split code in any of the supported languages. Import the Language enum and specify the language.

In [86]:
%pip install -qU langchain-text-splitters

In [87]:
from langchain_text_splitters import (
    Language,
    RecursiveCharacterTextSplitter,
)

In [88]:

# Full list of supported languages
[e.value for e in Language]

['cpp',
 'go',
 'java',
 'kotlin',
 'js',
 'ts',
 'php',
 'proto',
 'python',
 'rst',
 'ruby',
 'rust',
 'scala',
 'swift',
 'markdown',
 'latex',
 'html',
 'sol',
 'csharp',
 'cobol',
 'c',
 'lua',
 'perl']

In [89]:
# You can also see the separators used for a given language
RecursiveCharacterTextSplitter.get_separators_for_language(Language.PYTHON)

['\nclass ', '\ndef ', '\n\tdef ', '\n\n', '\n', ' ', '']
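
The first separators in that list are what keep whole definitions together. As a rough illustration in plain Python, the sketch below cuts a source string just before each top-level class or def. The helper name `split_before_definitions` is hypothetical, and unlike the real splitter it ignores chunk size and the remaining separators.

```python
import re

# First two separators from the Python list shown above.
PYTHON_TOP_LEVEL = ["\nclass ", "\ndef "]


def split_before_definitions(code):
    """Cut the source just before each top-level class/def, keeping the keyword with its body."""
    # A lookahead matches the position without consuming the separator itself.
    pattern = "|".join(f"(?={re.escape(sep)})" for sep in PYTHON_TOP_LEVEL)
    return [piece.strip() for piece in re.split(pattern, code) if piece.strip()]


SAMPLE = '''import math

def area(r):
    return math.pi * r * r

class Circle:
    def __init__(self, r):
        self.r = r
'''

pieces = split_before_definitions(SAMPLE)
for piece in pieces:
    print(repr(piece))
```

The indented `def __init__` stays inside the class piece because "\ndef " only matches a def at the start of a line.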

#Python
Here’s an example using the Python text splitter:



In [90]:
PYTHON_CODE = """
def hello_world():
    print("Hello, World!")

# Call the function
hello_world()
"""


In [91]:
python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=50, chunk_overlap=0
)
python_docs = python_splitter.create_documents([PYTHON_CODE])
python_docs


[Document(page_content='def hello_world():\n    print("Hello, World!")'),
 Document(page_content='# Call the function\nhello_world()')]

#**JS**
Here’s an example using the JS text splitter:


In [92]:

JS_CODE = """
function helloWorld() {
  console.log("Hello, World!");
}

// Call the function
helloWorld();
"""

js_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.JS, chunk_size=60, chunk_overlap=0
)
js_docs = js_splitter.create_documents([JS_CODE])
js_docs

[Document(page_content='function helloWorld() {\n  console.log("Hello, World!");\n}'),
 Document(page_content='// Call the function\nhelloWorld();')]

#**Markdown**
Here’s an example using the Markdown text splitter:


In [94]:
markdown_text = """
# 🦜️🔗 LangChain

⚡ Building applications with LLMs through composability ⚡

## Quick Install

```bash
# Hopefully this code block isn't split
pip install langchain
```

As an open-source project in a rapidly developing field, we are extremely open to contributions.
"""



In [95]:
md_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.MARKDOWN, chunk_size=60, chunk_overlap=0
)
md_docs = md_splitter.create_documents([markdown_text])
md_docs

[Document(page_content='# 🦜️🔗 LangChain'),
 Document(page_content='⚡ Building applications with LLMs through composability ⚡'),
 Document(page_content='## Quick Install\n\n```bash'),
 Document(page_content="# Hopefully this code block isn't split"),
 Document(page_content='pip install langchain'),
 Document(page_content='```'),
 Document(page_content='As an open-source project in a rapidly developing field, we'),
 Document(page_content='are extremely open to contributions.')]

#**Semantic Chunking**
Splits the text based on semantic similarity.

Taken from Greg Kamradt’s wonderful notebook: https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/5_Levels_Of_Text_Splitting.ipynb

All credit to him.

At a high level, this splits the text into sentences, then groups them into groups of 3 sentences, and then merges groups that are similar in the embedding space.
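
The idea of merging by similarity can be illustrated without an embedding API. This is a toy sketch under loud assumptions, not SemanticChunker's algorithm: the "embedding" is a bag-of-words count, sentences are compared one at a time with their predecessor rather than in groups of 3, and the names `embed`, `cosine_similarity` and `semantic_chunks` and the 0.2 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words 'embedding'; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk whenever a sentence is dissimilar to the one before it."""
    if not sentences:
        return []
    groups = [[sentences[0]]]
    for previous, current in zip(sentences, sentences[1:]):
        if cosine_similarity(embed(previous), embed(current)) >= threshold:
            groups[-1].append(current)
        else:
            groups.append([current])
    return [" ".join(group) for group in groups]


sentences = [
    "The cat sat on the mat.",
    "The cat chased the mouse.",
    "Stock prices fell sharply today.",
    "Investors sold stock as prices fell.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

The two cat sentences share words and end up in one chunk, and the two stock-market sentences form the other. With real embeddings, sentences can be grouped by meaning even when they share no words.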



In [96]:
#Install Dependencies
!pip install --quiet langchain_experimental langchain_openai

In [99]:
#Load Example Data
# This is a long document we can split up.
with open("/content/text.txt") as f:
    state_of_the_union = f.read()

In [105]:
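#Create Text Splitter
# OpenAIEmbeddings needs an OpenAI API key: set the OPENAI_API_KEY environment
# variable before running this cell, otherwise it raises the ValidationError shown below.
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings

text_splitter = SemanticChunker(OpenAIEmbeddings())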

ValidationError: 1 validation error for OpenAIEmbeddings
__root__
  Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)

In [102]:
#Split Text
docs = text_splitter.create_documents([state_of_the_union])
print(docs[0].page_content)


it is butterfly.
this is a lion.
i love this.
