In [35]:
from dotenv import load_dotenv
load_dotenv('../.env')

True

# LLM Streaming 101

If you look at the documentation for LLM APIs in Python, you will often see streamed text consumed chunk by chunk and printed as it arrives, as in the following LangChain example.

In [40]:
from langchain.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-instruct", temperature=0, max_tokens=512)
for chunk in llm.stream("Write me a song about stream buffering."):
    print(chunk, end="", flush=True)



Verse 1:
I'm sitting here, staring at my screen
Waiting for the video to start, it seems
But all I see is that spinning wheel
My patience is wearing thin, I can feel

Pre-Chorus:
I just wanna watch my favorite show
But this buffering is so slow
I'm getting frustrated, can't you see
Why won't this stream just let me be?

Chorus:
Stream buffering, it's such a pain
I just wanna watch, but it's all in vain
I'm stuck in this endless loop
Oh, stream buffering, what's the scoop?

Verse 2:
I've tried refreshing, I've tried to wait
But this buffering just won't abate
I've checked my connection, it's all good
So why won't this video play like it should?

Pre-Chorus:
I just wanna watch my favorite show
But this buffering is so slow
I'm getting frustrated, can't you see
Why won't this stream just let me be?

Chorus:
Stream buffering, it's such a pain
I just wanna watch, but it's all in vain
I'm stuck in this endless loop
Oh, stream buffering, what's the scoop?

Bridge:
I've been waiting for what

# The typewriter effect in Python

> I want to make a program which reads characters from a string and prints each character after some delay so it looks like a typewriter effect.

In [20]:
from time import sleep
def typewriter(text: str, per_char_delay: float = 0.045) -> None:
    for char in text:
        sleep(per_char_delay)
        print(char, end="", flush=True) # print without a newline and flush right away
        # wondering what the combination of end="" and flush=True does?
        # keep reading this notebook. seasoned pythonistas can skip ahead.

In [22]:
typewriter("""
The golden moments in the stream of life rush past us, 
           and we see nothing but sand; the angels come to visit us, 
           and we only know them when they are gone.""")


The golden moments in the stream of life rush past us, 
           and we see nothing but sand; the angels come to visit us, 
           and we only know them when they are gone.

# Understanding streams and buffering in Python

## Concepts
- buffering: collect part of a data stream in a buffer, then at some point write the buffer's contents to an output stream, which "flushes" (empties) the buffer.
    - a buffer is a region of memory used to temporarily hold data while it is being moved from one place to another.
- stream: a file-like object that carries either bytes (`BufferedReader`, `BufferedWriter`) or text (`TextIOWrapper`).
- `print()` function: the most common way of writing to a file stream in Python.
- `sys.stdout`: the standard output stream that `print()` writes to by default.
- `sys.stdout.flush()`: a call to flush the standard output stream.
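
To make the buffering idea concrete, here is a minimal sketch using the standard `io` module. The in-memory `io.BytesIO` object is only a stand-in for a real destination such as a file or terminal, and the variable names are chosen for illustration.

```python
import io

# An in-memory byte stream stands in for the real destination (file, terminal, ...).
destination = io.BytesIO()

# Wrap it in a buffered writer; small writes are held in the buffer first.
buffered = io.BufferedWriter(destination)
buffered.write(b"hello")

before_flush = destination.getvalue()  # nothing has reached the destination yet

buffered.flush()  # push the buffer's contents to the destination
after_flush = destination.getvalue()

print(before_flush, after_flush)  # b'' b'hello'
```

The five bytes only show up in `destination` after the explicit `flush()`, because they are far smaller than the writer's default buffer size.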

## What is `print()`?
- The Python `print` function wraps the C function `PyFile_WriteObject`. [Implementation here](https://github.com/python/cpython/blob/0066ab5bc58a036b3f448cd6f9bbdd92120e39ba/Python/bltinmodule.c#L2014-L2106).
- `PyFile_WriteObject` writes an object to a _file stream_. 

### What is `print(end="", flush=True)`?

#### `end=""`
The `end` parameter of the `print()` function sets the string that is written after the printed values.

By default, `print()` appends a newline because the default value of `end` is `"\n"`. After writing its arguments, `print()` writes the `end` string to the file stream, so passing `end=""` means nothing is written after them. The `Py_PRINT_RAW` flag that `print()` passes to `PyFile_WriteObject` does not control the newline: it tells `PyFile_WriteObject` to convert the object with `str()` instead of `repr()`. You can see the implementation of `PyFile_WriteObject` [here](https://github.com/python/cpython/blob/0066ab5bc58a036b3f448cd6f9bbdd92120e39ba/Objects/fileobject.c#L108-L138).
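
A small sketch of how `end` changes what is written. It captures the output in `io.StringIO` objects through the `file` parameter so the exact characters can be inspected; the variable names are illustrative only.

```python
import io

# Capture what print() writes by pointing it at in-memory text streams.
default_end = io.StringIO()
print("stream", file=default_end)             # end defaults to "\n"

no_end = io.StringIO()
print("stream", end="", file=no_end)          # nothing is appended

custom_end = io.StringIO()
print("stream", end=" ...", file=custom_end)  # any string can be used

print(repr(default_end.getvalue()))  # 'stream\n'
print(repr(no_end.getvalue()))       # 'stream'
print(repr(custom_end.getvalue()))   # 'stream ...'
```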

#### `flush=True`
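The `flush` parameter controls whether `print()` calls the stream's `flush()` method after writing. With `flush=True`, any buffered text is pushed to its destination immediately instead of waiting for the buffer to fill or for a newline on a line-buffered stream.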
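
One way to observe this is to hand `print()` a file-like object that records the calls it receives. `RecordingStream` below is a hypothetical helper written for this sketch, not part of the standard library; `print()` only needs a `write()` method, plus `flush()` when `flush=True`.

```python
class RecordingStream:
    """Hypothetical file-like object that records the calls print() makes."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        self.calls.append(("write", text))
        return len(text)

    def flush(self):
        self.calls.append(("flush",))


plain = RecordingStream()
print("a", end="", file=plain)                # flush defaults to False

flushed = RecordingStream()
print("a", end="", file=flushed, flush=True)  # asks print() to flush afterwards

plain_flushes = plain.calls.count(("flush",))
flushed_flushes = flushed.calls.count(("flush",))
print(plain_flushes, flushed_flushes)  # 0 1
```

Without `flush=True`, `print()` leaves it to the stream to decide when to flush; with it, `flush()` is called once after the text has been written.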

In [33]:
def weird_typewriter_with_newline(text: str, per_char_delay: float = 0.045) -> None:
    for char in text:
        sleep(per_char_delay)
        print(char, flush=True) # no end="" here, so each character is still flushed immediately but is followed by the default newline

In [34]:
weird_typewriter_with_newline("""
If you admire somebody, you should go ahead and tell them. 
           People never get the flowers while they can still smell them.""")



I
f
 
y
o
u
 
a
d
m
i
r
e
 
s
o
m
e
b
o
d
y
,
 
y
o
u
 
s
h
o
u
l
d
 
g
o
 
a
h
e
a
d
 
a
n
d
 
t
e
l
l
 
t
h
e
m
.
 


 
 
 
 
 
 
 
 
 
 
 
P
e
o
p
l
e
 
n
e
v
e
r
 
g
e
t
 
t
h
e
 
f
l
o
w
e
r
s
 
w
h
i
l
e
 
t
h
e
y
 
c
a
n
 
s
t
i
l
l
 
s
m
e
l
l
 
t
h
e
m
.


## What is a file stream?

In [12]:
import sys

# The default file stream is sys.stdout
text = "Never forget that only dead fish swim with the stream."
print(text)
sys.stdout.write(text + '\n');
# print adds the trailing '\n' by default; on a line-buffered stream
# (such as an interactive terminal) writing a newline also flushes the buffer

Never forget that only dead fish swim with the stream.
Never forget that only dead fish swim with the stream.
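
Whether a newline flushes the buffer depends on the stream being line-buffered. The sketch below makes that visible with a temporary file opened with `buffering=1` (line buffering in text mode). The scratch file and the `on_disk` helper are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)


def on_disk():
    """Return what a separate reader currently sees in the file."""
    with open(path, encoding="utf-8") as reader:
        return reader.read()


with open(path, "w", buffering=1, encoding="utf-8") as f:  # buffering=1: line buffered
    f.write("no newline yet")
    before_newline = on_disk()  # the text is still sitting in the buffer
    f.write("\n")
    after_newline = on_disk()   # the newline triggered a flush

os.remove(path)
print(repr(before_newline), repr(after_newline))  # '' 'no newline yet\n'
```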


In [13]:
# change the file stream to sys.stderr
print("If it weren't for the rocks in its bed, the stream would have no song.", file=sys.stderr)

If it weren't for the rocks in its bed, the stream would have no song.


In [25]:
# change the file stream to a regular file
with open("regular-file.txt", 'w') as f: # more on what `open` is in the next section
    print("Life is an unending stream of extenuating circumstances.", file=f)

In [26]:
# look at the contents of the file
! cat regular-file.txt
! rm regular-file.txt # remove the file

Life is an unending stream of extenuating circumstances.


## The `sys.stdout` and `open` file streams

In a regular Python interpreter, the default file stream `sys.stdout` is typically a `TextIOWrapper` object, the text layer that encodes strings to bytes. The `TextIOWrapper` wraps a `BufferedWriter`, which collects those bytes in a buffer. The `BufferedWriter` wraps a `FileIO` object, the unbuffered "raw" layer. The `FileIO` object wraps an operating-system file descriptor and writes to it directly, without going through C's buffered `FILE` streams.

Let's unpack this gibberish.

### Concepts
- [The `io` module](https://docs.python.org/3/library/io.html)
- What actually is `open` and how does it relate to `sys.stdout`? 
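
A minimal sketch for inspecting these layers on a file opened in text mode, using the `buffer` and `raw` attributes documented in the `io` module. The temporary file and the variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only to have something to open.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w") as f:
    text_layer = f                 # str in, bytes out
    buffered_layer = f.buffer      # holds bytes until flushed
    raw_layer = f.buffer.raw       # talks to the operating system
    layer_names = [
        type(layer).__name__
        for layer in (text_layer, buffered_layer, raw_layer)
    ]
    descriptor = raw_layer.fileno()  # the OS-level file descriptor (an int)

os.remove(path)
print(layer_names)  # ['TextIOWrapper', 'BufferedWriter', 'FileIO']
```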

In [28]:
# make a TextIOWrapper
import io 

with open('something.txt', 'w') as f:
    print(type(f))
    print(isinstance(f, io.TextIOWrapper))
    
! rm something.txt # clean up the random file we created

<class '_io.TextIOWrapper'>
True


In [30]:
print(type(sys.stdout))
print(isinstance(sys.stdout, io.TextIOWrapper)) # why are you false? 🤔

# hmm, what is up with the output of this cell and the title of this section?
# notice that this code gives a different result in a notebook vs. in a regular python repl 🥸
# open your terminal and try it after typing `python`.
# you will see that sys.stdout is in fact an instance of the same class as the `f` we created with `open` in the previous cell.

# lesson: outside a notebook, `sys.stdout` and the object returned by `open` in text mode are both `TextIOWrapper` objects.

<class 'ipykernel.iostream.OutStream'>
False


## Text vs. everything else
- `TextIOWrapper` is a buffered text stream.
- `TextIOWrapper` wraps a `BufferedWriter`, which can work on any kind of data (e.g., images, audio, video). 

In [32]:
# open a file in write mode for text and binary data
for mode in ['w', 'wb']:
    with open('throwaway-file', mode) as f:
        print(type(f))
        print(isinstance(f, io.TextIOWrapper))
        print(isinstance(f, io.BufferedWriter))    

! rm throwaway-file # clean up the random file we created

<class '_io.TextIOWrapper'>
True
False
<class '_io.BufferedWriter'>
False
True
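
The difference between the two layers can also be seen in what they accept. In this sketch a `TextIOWrapper` is placed over an in-memory `io.BytesIO` (standing in for a real file) to show that text is encoded to bytes on the way down, while a binary stream rejects `str` outright. The sample string and variable names are illustrative.

```python
import io

# Text layer on top of an in-memory binary stream.
raw_bytes = io.BytesIO()
text_stream = io.TextIOWrapper(raw_bytes, encoding="utf-8")
text_stream.write("café")  # str goes in ...
text_stream.flush()
encoded = raw_bytes.getvalue()  # ... UTF-8 bytes come out underneath

# A binary stream only accepts bytes-like objects.
try:
    io.BytesIO().write("café")
    binary_accepts_str = True
except TypeError:
    binary_accepts_str = False

print(encoded, binary_accepts_str)  # b'caf\xc3\xa9' False
```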
