### Installation

LangTorch works with Python 3.8 or higher. To install LangTorch using pip:

In [None]:
!pip install langtorch

To use the OpenAI API, you need to set the `OPENAI_API_KEY` environment variable. You can find your API key in the OpenAI dashboard.

In [None]:
import os
os.environ["OPENAI_API_KEY"] = "your_api_key"
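
If you would rather not hardcode the key in a notebook, one option is to set it in your shell and check for it before making any calls. The sketch below is only an illustration: `require_api_key` is a hypothetical helper, not part of LangTorch.

```python
import os

def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment, or fail early with a clear message."""
    key = os.environ.get(name, "")
    # Treat the tutorial's placeholder value as "not set".
    if not key or key == "your_api_key":
        raise RuntimeError(f"Set the {name} environment variable before calling the API.")
    return key

# Typical use at the top of a notebook (hypothetical):
# require_api_key()
```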

## 1. Text Tensors and Simple chains

### Getting tensor text data

`TextTensors` are designed to simplify simultaneous handling of text data. They inherit most functionality from PyTorch's `torch.Tensor` but hold textual data, which you may create from documents, prompt templates, completion dictionaries and more.

In [None]:
from langtorch import TextTensor

prompts = TextTensor([["This is the first prompt, with {field} to fill in."],
                      ["This is another prompt, also with a {field}."]])
print(f"prompts (shape={tuple(prompts.shape)}):\n{prompts},\n")

completion = TextTensor([{"field": "data"}])
print(f"completion (shape={tuple(completion.shape)}):\n{completion}\n")

two_completions = TextTensor([{"field": "data"}, {"field": "value"}])
print(f"two_completions (shape={tuple(two_completions.shape)}):\n{two_completions}\n")

prompts (shape=(2, 1)):
[[This is the first prompt, with field to fill in.]
 [This is another prompt, also with a field.]],

completion := TextTensor([ data ], shape=torch.Size([1]))

two_completions := TextTensor([ data   value ], shape=torch.Size([2]))



### TextTensor operations for prompt templating

`TextTensors` provide operations that format and edit many entries at once, following array broadcasting rules. Adding one `TextTensor` to another appends its content, while multiplication performs a more complex operation that can be used for template formatting:

In [None]:
lines = TextTensor(open('paper.md','r').readlines())
lines = lines + " Have a great day!"
lines
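
The broadcasting rules mentioned above are assumed here to be the usual NumPy/PyTorch shape rules. As a minimal plain-Python sketch of how two shapes combine (`broadcast_shape` is an illustrative helper, not a LangTorch function):

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Combine two shapes the way NumPy/PyTorch broadcasting does."""
    result = []
    # Compare dimensions from the last one backwards; a missing dimension counts as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"shapes {a} and {b} cannot be broadcast together")
        result.append(x if y == 1 else y)
    return tuple(reversed(result))

print(broadcast_shape((2, 1), (2,)))  # two prompts (2, 1) with two completions (2,)
print(broadcast_shape((1,), (2, 1)))  # one template (1,) with a (2, 1) column of completions
```

In PyTorch itself, `torch.broadcast_shapes` performs the same computation.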

In [None]:
completions = TextTensor([[{"name": "Luciano"}],
                          [{"name": "Massimo"}]])
print(completions)

In [None]:
prompts = TextTensor(["Hello, {name}!"]) * completions


A useful informal definition of multiplication is that, when two entries are multiplied, the right `Text` acts like a format operation: it replaces keys with values (here, `{name}` with `Luciano`) or appends its content if there is nothing to replace. For a more in-depth look, see [TextTensor Multiplication](langtorch.org/reference/multiplication).
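
To make the informal definition concrete, here is a rough plain-Python analogy using strings and dicts. It is only a sketch of the "replace or append" idea described above, not how LangTorch implements multiplication; `format_or_append` is a hypothetical name.

```python
def format_or_append(template: str, completion: dict) -> str:
    """Replace {key} placeholders with values; append values that have no placeholder."""
    out = template
    leftovers = []
    for key, value in completion.items():
        placeholder = "{" + key + "}"
        if placeholder in out:
            out = out.replace(placeholder, value)
        else:
            leftovers.append(value)
    return out + "".join(leftovers)

templates = ["Hello, {name}!"]                              # shape (1,)
completions = [[{"name": "Luciano"}], [{"name": "Massimo"}]]  # shape (2, 1)

# Broadcasting (1,) against (2, 1) gives one formatted entry per completion row.
formatted = [[format_or_append(t, c) for t, c in zip(templates, row)] for row in completions]
print(formatted)  # [['Hello, Luciano!'], ['Hello, Massimo!']]
```
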
### Performing a task with TextModules


TextModules, like `nn.Module`, implement a `forward` method that works on (text) tensors. By default, they are initialized with a TextTensor of prompts, which the forward pass formats using the input TextTensor (just like in the example above).

To achieve interesting behavior, an `nn.Module` layer usually ends by passing the multiplied tensors to an "activation function". By analogy, TextModules usually end with an activation that makes an LLM call on the formatted prompts (for more on this parallel, see [langtorch.tt](langtorch.org/reference/tt)). LangTorch activations like `OpenAI` execute their LLM calls on each entry of the input TextTensor in parallel.

In [None]:
from langtorch import TextModule, OpenAI
llm = OpenAI("gpt-4", T=0.)  # Pass any API kwargs here to customize the call
translate = TextModule("Translate this text to Polish: {}", activation=llm)


In [None]:
output = translate(prompts)


## 2. Implementing popular methods
### Parallel and Chained calls with TextModules
LangTorch uses a custom implementation to speed up and cache API calls, which by default run in parallel for all TextTensor entries passed to an LLM activation. Calls therefore run in parallel automatically whenever multiple prompts, multiple input values, or both are passed to an LLM.
The simplest way to chain TextModules is to use `torch.nn.Sequential` directly. To create a more complex chain you may, as in torch, define a module subclass that adds custom behavior or combines many submodules into one. We will demonstrate both with examples of popular LLM methods.
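
For intuition only, the general idea of parallel, cached calls followed by chained steps can be sketched with the standard library. This is not LangTorch's implementation; `fake_llm`, `add_cot`, `run_parallel` and `run_chain` are hypothetical stand-ins, and no real API is called.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=None)
def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call; repeated prompts are answered from the cache."""
    return f"<completion for: {prompt!r}>"

def add_cot(prompt: str) -> str:
    """A chain step that appends a fixed instruction to each prompt."""
    return prompt + " Let's think step by step."

def run_parallel(fn, items):
    """Apply fn to every item concurrently, keeping the input order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(fn, items))

def run_chain(items, steps):
    """Feed the outputs of each step into the next one, like a sequential container."""
    for step in steps:
        items = run_parallel(step, items)
    return items

outputs = run_chain(["170*32 =", "4*20 =", "2**10*5 ="], [add_cot, fake_llm])
for line in outputs:
    print(line)
```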

### Chain of Thought
The simplest custom modules are those that implement prompting methods like Chain of Thought, where all we need is to append a fixed string to the input. This can be done by creating a reusable TextModule that we can chain with any task module. Let's define a module with a task prompt template:

In [None]:
some_prompt_template = "Solve this equation: {}\n"
task = TextModule(some_prompt_template)
print(task(TextTensor("2+2 =")))

`{}` in a prompt template is a positional argument that accepts one entry from the input. We can have multiple such placeholders in one prompt if the input consists of many entries, e.g. `input = TextTensor([{"key1":"text1", "key2":"text2"}])`.

For our chain of thought module we should use the placeholder `{*}`, which is a "wildcard" key that places all of the input entries in its place (here, the "Solve this equation" prompt and its completion).

In [None]:
import torch
chain_of_thought = TextModule("{*} Let's think step by step.")

task_module_w_CoT = torch.nn.Sequential(
    task,
    chain_of_thought,
    OpenAI("gpt-3.5-turbo")  # We end with an OpenAI model call. We could omit the model name as this is the default OpenAI model
)

input_tensor = TextTensor(["170*32 =", "4*20 =", "123*45/10 =", "2**10*5 ="])
output_tensor = task_module_w_CoT(input_tensor)
output_tensor
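
To see roughly which strings would reach the model in the chain above, the two templating steps can be imitated with plain string replacement. This is a simplified sketch: it assumes `{}` takes one input entry and `{*}` receives the whole previous output as a single string, and it ignores how LangTorch actually stores keys and values.

```python
task_template = "Solve this equation: {}\n"
cot_template = "{*} Let's think step by step."

inputs = ["170*32 =", "4*20 =", "123*45/10 =", "2**10*5 ="]

# Step 1: the task module fills its positional placeholder with each input entry.
after_task = [task_template.replace("{}", entry) for entry in inputs]

# Step 2: the chain-of-thought module puts everything it received where {*} was.
after_cot = [cot_template.replace("{*}", prompt) for prompt in after_task]

for prompt in after_cot:
    print(repr(prompt))
```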

As in PyTorch, we can also create a subclass that implements its own `forward` and `__init__` methods.

In [None]:
class ChainOfThought(TextModule):
    def forward(self, input):
        return super().forward(input + " Let's think step by step.")

task = ChainOfThought(some_prompt_template, activation=llm)


### Ensemble / Self-consistency
Many benefits of representing texts "geometrically" in a matrix or tensor come from being able to create a meaningful structure, e.g. a 2D matrix in which columns hold different versions of the same text and rows hold subsequent paragraphs. Methods like ensemble voting and self-consistency require creating multiple completions for the same task, which can be represented by adding such a "version" dimension.
In this example, we will build a module that creates multiple answer entries for each input entry and combines them back together to increase overall performance. First, to add a new dimension with different answers given by the LLM, we only need to adjust the `n` parameter. Additionally, we can set the system message:


In [None]:
from langtorch import OpenAI
ensemble_llm = OpenAI("gpt-3.5-turbo",
                      system_message="You are a rewriting bot that answers only with the revised text",
                      T=1.1, # High temperature to sample diverse completions
                      n = 5) # 5 completions for each entry


To use a concrete example, we will write a module that uses an ensemble to compress a text paragraph by paragraph. The task description is inspired by the Chain of Density method. For now, let's assume `paragraphs` is defined as a TextTensor with 15 paragraph-entries:


In [None]:
rewrite = TextModule(["Compress all information from the paragraph into an entity-dense telegraphic summary: "], activation = ensemble_llm)
ensemble_summaries = rewrite(paragraphs)
print(ensemble_summaries.shape)


In [None]:
import langtorch
combined_summaries = langtorch.mean(ensemble_summaries, dim=-1)
print(combined_summaries.shape)


In [None]:
summary = langtorch.mean(combined_summaries, dim=-1)
print(summary)


Similar approaches can be used for more complicated ensemble methods or combined with methods like chain of thought to increase accuracy with "self-consistency".
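
As a point of comparison, self-consistency for short answers is often implemented as a majority vote over the sampled completions. The sketch below uses made-up sample answers and a hypothetical `majority_vote` helper; it does not call an LLM and is not a LangTorch function.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer after trimming whitespace."""
    counts = Counter(answer.strip() for answer in answers)
    return counts.most_common(1)[0][0]

# One row per question, one column per sampled completion (the "version" dimension).
sampled_answers = [
    ["5440", "5440", "5400", "5440 ", "544"],
    ["80", "80", "80", "8", "80"],
]

final_answers = [majority_vote(row) for row in sampled_answers]
print(final_answers)  # ['5440', '80']
```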

### Working with structured documents
Instead of strings, each entry of a TextTensor is an instance of [`langtorch.Text`](reference/text), which allows for more complex text processing. The `Text` class can load documents, parse most markup languages and provide a helpful interface for accessing and modifying their structured text segments. We will prepare data for a rewrite task like before by parsing a markdown file of a paper on the abilities of language models, available here: [paper.md](/static/paper.txt). As the text has headers and other text blocks, we'll select only the paragraphs, which can be done with the `iloc` and `loc` accessors:


In [None]:
!wget https://raw.githubusercontent.com/yourusername/langtorch/main/paper.md

In [None]:
from langtorch import Text
paper = Text.from_file("paper.md")
first_block = paper.iloc[0]


In [None]:
print(set(paper.keys()))


In [None]:
paragraphs = paper.loc["Para"]
rewritten_paragraphs = rewrite(paragraphs)
paper.loc["Para"] = rewritten_paragraphs
print(paper)
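
One rough mental model for the cells above is a document stored as a list of `(key, text)` blocks, where selecting by key and writing back leaves all other blocks untouched. The sketch below uses made-up blocks and hypothetical helpers `select` and `assign`; it is not the `Text` API.

```python
blocks = [
    ("Header", "# Abilities of language models"),
    ("Para", "First paragraph."),
    ("Header", "## Methods"),
    ("Para", "Second paragraph."),
]

def select(blocks, key):
    """Return the texts of all blocks with the given key, in document order."""
    return [text for k, text in blocks if k == key]

def assign(blocks, key, new_texts):
    """Return a copy of blocks where texts under `key` are replaced, in order."""
    replacements = iter(new_texts)
    return [(k, next(replacements)) if k == key else (k, text) for k, text in blocks]

paragraphs = select(blocks, "Para")
rewritten = [p.upper() for p in paragraphs]  # stand-in for the rewrite module
updated = assign(blocks, "Para", rewritten)
print(updated)
```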


## 3. Using Tensor Embeddings to Build Retrievers
Using embeddings with TextTensors is straightforward: every TextTensor can generate its own embeddings and automatically acts as an embedding tensor when passed to torch functions like cosine similarity. These representations (available under the `.embedding` attribute) are created lazily, only right before they are needed (via a set embedding model, by default OpenAI's `text-embedding-3-small`).


In [None]:
import torch

tensor1 = TextTensor([[["Yes"], ["No"]]])
tensor2 = TextTensor(["Yeah", "Nope", "Yup", "Non"])

torch.cosine_similarity(tensor1,tensor2)
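
For intuition about what such a call computes, here is a plain-Python sketch of pairwise cosine similarity. The 2-d vectors are invented stand-ins for real embeddings, chosen only so that similar words point in similar directions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Made-up "embeddings" (assumption: real ones come from an embedding model).
toy_embedding = {
    "Yes": (1.0, 0.1), "No": (-1.0, 0.2),
    "Yeah": (0.9, 0.2), "Nope": (-0.8, 0.3),
}

rows, cols = ["Yes", "No"], ["Yeah", "Nope"]
# Broadcasting a column of 2 entries against a row of 2 entries gives a 2 x 2 grid.
grid = [[cosine_similarity(toy_embedding[r], toy_embedding[c]) for c in cols] for r in rows]
for name, scores in zip(rows, grid):
    print(name, [round(s, 2) for s in scores])
```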


### Build Custom Retriever and RAG modules
Because a `TextTensor` can automatically act as a `Tensor` of its embeddings, we can very compactly implement e.g. a retriever, which for each entry in the input finds in parallel the `k` entries with the highest cosine similarity among the documents it holds:


In [None]:
class Retriever(TextModule):
    def __init__(self, documents: TextTensor):
        super().__init__()
        self.documents = TextTensor(documents).view(-1)

    def forward(self, query: TextTensor, k: int = 5):
        cos_sim = torch.cosine_similarity(self.documents, query.reshape(1))
        return self.documents[cos_sim.topk(k).indices]  # topk returns (values, indices)


In [None]:
retriever = Retriever(open("doc.txt", "r").readlines())
query = TextTensor("How to build a retriever?")
print(retriever(query))
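
The same "score every document, keep the top `k`" logic can be tried without any API calls. The sketch below swaps real embeddings for simple word counts, so it only illustrates the ranking step; `embed`, `cosine` and `retrieve` are hypothetical helpers and the documents are made up.

```python
import heapq
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scores = [cosine(q, embed(doc)) for doc in documents]
    top = heapq.nlargest(k, range(len(documents)), key=scores.__getitem__)
    return [documents[i] for i in top]

docs = [
    "a retriever ranks documents by similarity",
    "cats sleep most of the day",
    "build a retriever with cosine similarity",
]
print(retrieve("how to build a retriever", docs, k=2))
```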


Note how the implementation didn't require us to learn about any new operations we would not find in regular PyTorch. One goal of LangTorch is to give developers control over these lower level operations, while being able to write compact code without a multitude of classes. For this reason implementations such as the retriever above are not pre-defined classes in the main package.  
We can now compose this module with a Module making LLM calls to get a custom Retrieval Augmented Generation pipeline:


In [None]:
class RAG(TextModule):
    def __init__(self, documents: TextTensor, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.retriever = Retriever(documents)

    def forward(self, user_message: TextTensor, k: int = 5):
        retrieved_context = self.retriever(user_message, k) +"\n"
        user_message = user_message + "\nCONTEXT:\n" + retrieved_context.sum()
        return super().forward(user_message)


In [None]:
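user_query = TextTensor("What abilities of language models does the paper discuss?")  # example query, replace with your own
rag_chat = RAG(paragraphs,
               prompt="Use the context to answer the following user query: ",
               activation="gpt-3.5-turbo")
assistant_response = rag_chat(user_query)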
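
To inspect what kind of prompt the `forward` method above assembles, the string concatenation can be reproduced in plain Python. This is a sketch with made-up documents; `build_rag_prompt` is a hypothetical helper, and the exact text LangTorch sends to the model may differ.

```python
def build_rag_prompt(instruction, user_message, retrieved_docs):
    """Mirror the concatenation order used in the RAG forward pass."""
    # Each retrieved document gets a trailing newline, then all are joined (like .sum()).
    context = "".join(doc + "\n" for doc in retrieved_docs)
    return instruction + user_message + "\nCONTEXT:\n" + context

prompt = build_rag_prompt(
    "Use the context to answer the following user query: ",
    "How to build a retriever?",
    ["Document A text.", "Document B text."],
)
print(prompt)
```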


With only small modifications to the retriever, this module could also perform batched inference, running multiple simultaneous queries without much additional latency. Note that `prompt` and `activation` are arguments inherited from TextModule and need the `super().forward` call to work.

We are excited to see what you will build with LangTorch. If you want to share some examples or have any questions, feel free to ask on our [Discord](https://discord.gg/jkreqtCCkv). In the likely event that you encounter a bug, report it on Discord or post it on the [GitHub Repo](https://github.com/AdamSobieszek/langtorch) and we will fix it ASAP.