<a href="https://colab.research.google.com/github/ancestor9/Data-Analyst-with-Gemini-/blob/main/2%EC%9D%BC%EC%B0%A8/Data_Buffer_Stream.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Buffer** and **Stream** are concepts used in data handling, particularly in I/O operations.
- https://docs.python.org/3/library/io.html

<img src ="https://media.geeksforgeeks.org/wp-content/uploads/20240229180931/What-is-a-Buffer.webp">

### **Buffer**
- A buffer is a **temporary storage area that holds data** while it is being transferred from one place to another.
- It smooths out differences in speed between components of a system (e.g., reading data from a slow disk and writing it to fast memory).
- Data is first collected in the buffer and then processed in larger batches, which reduces the number of slow operations performed directly on the data source.
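
The batching behaviour described above can be observed directly. The following is a minimal sketch, assuming CPython's default buffered binary file objects; the file name `demo.bin` and the temporary directory are illustrative and not part of the original notebook. It writes a few bytes and compares the on-disk size before and after `flush()`.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.bin")

with open(path, "wb") as f:  # default buffering: writes land in an in-memory buffer first
    f.write(b"hello")
    size_before_flush = os.path.getsize(path)  # the bytes are usually still in the buffer
    f.flush()  # hand the buffered bytes over to the operating system
    size_after_flush = os.path.getsize(path)

print(size_before_flush, size_after_flush)
```

On CPython the first number is typically smaller than the second, because the five bytes stay in the buffer until `flush()` is called or the file is closed.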

### **Stream**
- A stream represents a **continuous flow of data** that can be read from or written to incrementally.
- Streams can handle data that is too large to be loaded into memory all at once.
- They are often used for processing data that is coming from or going to a network, a file, or another I/O source.
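
The incremental reading described above can be sketched as a generator that yields fixed-size chunks. The function name `read_in_chunks` and the chunk size are hypothetical choices for this illustration, and `io.BytesIO` stands in for a file or network source.

```python
import io


def read_in_chunks(stream, chunk_size=8):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals the end of the stream
            break
        yield chunk


# io.BytesIO stands in for a file or socket here.
source = io.BytesIO(b"abcdefghijklmnopqrstuvwxyz")
chunks = list(read_in_chunks(source, chunk_size=8))
print(chunks)  # [b'abcdefgh', b'ijklmnop', b'qrstuvwx', b'yz']
```

Collecting the chunks into a list is only for display. With a large source, each chunk would normally be processed and discarded inside the loop so that memory use stays bounded by `chunk_size`.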

## **1. Using Buffer with File I/O**
- Buffers are often used internally by I/O operations to improve performance.
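
One way to see this internal buffering is to inspect the objects that the built-in `open()` returns. This is a small sketch, assuming standard CPython behaviour; the scratch file `layers.txt` is hypothetical and used only here.

```python
import io
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "layers.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("buffered layers")

with open(path, "rb") as binary_file:
    binary_type = type(binary_file).__name__  # buffered binary reader

with open(path, "rb", buffering=0) as raw_file:
    raw_type = type(raw_file).__name__  # raw, unbuffered file object

with open(path, "r", encoding="utf-8") as text_file:
    text_type = type(text_file).__name__  # text layer
    inner_type = type(text_file.buffer).__name__  # binary buffer underneath the text layer

print(binary_type, raw_type, text_type, inner_type)
print(io.DEFAULT_BUFFER_SIZE)  # default buffer size in bytes; the value depends on the Python build
```

Text files are therefore layered: a `TextIOWrapper` handles decoding on top of a buffered binary object, and passing `buffering=0` in binary mode removes the buffer layer.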

In [1]:
import io

# Example of using a buffer

buffer = io.BytesIO()

buffer.write(b"Hello, this is a buffer example.")

# The write operation leaves the stream position at the end of the buffer, so read() finds nothing after it
print(buffer.read())  # Read the content from the buffer

b''


### Why can't it read "Hello, this is a buffer example."?
### Data is like a flowing river!

In [2]:
buffer.seek(0)  # Move the cursor to the beginning of the buffer

print(buffer.read())

b'Hello, this is a buffer example.'


In [3]:
print(buffer.read())

b''


In [4]:
print(buffer.read())

b''
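
The empty results come from the stream position: `write()` leaves it at the end of the data, and `read()` returns only what lies after it. The sketch below makes the position visible with `tell()`, `seek()`, and `getvalue()`, which are methods of `io.BytesIO`; the variable name `demo` is illustrative.

```python
import io

demo = io.BytesIO()
demo.write(b"Hello")

pos_after_write = demo.tell()  # the position sits at the end of the written data
first_read = demo.read()  # nothing lies after the position, so this is empty

demo.seek(0)  # rewind to the start
second_read = demo.read()  # now the full content is returned

whole = demo.getvalue()  # entire contents, regardless of the current position

print(pos_after_write, first_read, second_read, whole)  # 5 b'' b'Hello' b'Hello'
```

`getvalue()` is handy when the whole content is needed without moving the position, while `seek(0)` followed by `read()` mirrors how a regular file would be re-read.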


## **2. Using Stream with File I/O**
- Streams are typically used with files or network operations to read/write data incrementally.
- **Reading from a file using a stream:**

In [5]:
with open('example.txt', 'w') as f:
    f.write("This is an example text for the stream.\nAnother line of text\nhello world.")

with open('example.txt', 'r') as file_stream:
    print(file_stream)
    for line in file_stream:
        print(line.strip())  # Processing the file line by line (streaming)

<_io.TextIOWrapper name='example.txt' mode='r' encoding='utf-8'>
This is an example text for the stream.
Another line of text
hello world.


- **Writing to a file using a stream:**

In [6]:
with open('output.txt', 'w') as file_stream:
    for i in range(5):
        file_stream.write(f"This is line {i + 1}\n")

In [7]:
with open('output.txt', 'r') as file_stream:
    for line in file_stream:
        print(line.strip())  # Read back the lines written above, one at a time

This is line 1
This is line 2
This is line 3
This is line 4
This is line 5


### **Real example**

In [8]:
import nltk
nltk.download('gutenberg')

[nltk_data] Downloading package gutenberg to /root/nltk_data...
[nltk_data]   Unzipping corpora/gutenberg.zip.


True

In [10]:
from nltk.corpus import gutenberg
print(gutenberg.fileids())

['austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt', 'bryant-stories.txt', 'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt', 'chesterton-brown.txt', 'chesterton-thursday.txt', 'edgeworth-parents.txt', 'melville-moby_dick.txt', 'milton-paradise.txt', 'shakespeare-caesar.txt', 'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt', 'whitman-leaves.txt']


In [11]:
emma_text = gutenberg.raw('melville-moby_dick.txt')
print(emma_text[:1000])  # Print only the first 1,000 characters

[Moby Dick by Herman Melville 1851]


ETYMOLOGY.

(Supplied by a Late Consumptive Usher to a Grammar School)

The pale Usher--threadbare in coat, heart, body, and brain; I see him
now.  He was ever dusting his old lexicons and grammars, with a queer
handkerchief, mockingly embellished with all the gay flags of all the
known nations of the world.  He loved to dust his old grammars; it
somehow mildly reminded him of his mortality.

"While you take in hand to school others, and to teach them by what
name a whale-fish is to be called in our tongue leaving out, through
ignorance, the letter H, which almost alone maketh the signification
of the word, you deliver that which is not true." --HACKLUYT

"WHALE. ... Sw. and Dan. HVAL.  This animal is named from roundness
or rolling; for in Dan. HVALT is arched or vaulted." --WEBSTER'S
DICTIONARY

"WHALE. ... It is more immediately from the Dut. and Ger. WALLEN;
A.S. WALW-IAN, to roll, to wallow." --RICHARDSON'S DICTIONARY


In [12]:
text = emma_text[:1000]
len(text)

1000

In [13]:
# Save text to a file

with open('melville-moby_dick.txt', 'w', encoding='utf-8') as f:
    f.write(text)

In [14]:
buffer_size = 1024

def process_file(file_path, buffer_size):
    with open(file_path, 'r', encoding='utf-8') as file:
        while True:
            buffer = file.read(buffer_size)  # Read up to buffer_size characters
            if not buffer:  # Stop the loop when there is nothing left to read
                break
            print(buffer)  # Print the buffer


# Call the function; in text mode the buffer size of 1024 counts characters, not bytes
process_file('melville-moby_dick.txt', buffer_size)


[Moby Dick by Herman Melville 1851]


ETYMOLOGY.

(Supplied by a Late Consumptive Usher to a Grammar School)

The pale Usher--threadbare in coat, heart, body, and brain; I see him
now.  He was ever dusting his old lexicons and grammars, with a queer
handkerchief, mockingly embellished with all the gay flags of all the
known nations of the world.  He loved to dust his old grammars; it
somehow mildly reminded him of his mortality.

"While you take in hand to school others, and to teach them by what
name a whale-fish is to be called in our tongue leaving out, through
ignorance, the letter H, which almost alone maketh the signification
of the word, you deliver that which is not true." --HACKLUYT

"WHALE. ... Sw. and Dan. HVAL.  This animal is named from roundness
or rolling; for in Dan. HVALT is arched or vaulted." --WEBSTER'S
DICTIONARY

"WHALE. ... It is more immediately from the Dut. and Ger. WALLEN;
A.S. WALW-IAN, to roll, to wallow." --RICHARDSON'S DICTIONARY
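
A fixed-size read like the one above can cut a line or sentence in the middle of a chunk. One common way to handle this, sketched here under assumptions, is to carry the incomplete tail over to the next chunk. The function name `iter_lines_buffered` is hypothetical, and `io.StringIO` stands in for the text file.

```python
import io


def iter_lines_buffered(stream, buffer_size=16):
    """Read a text stream in fixed-size chunks and yield only complete lines."""
    leftover = ""  # partial line carried over between chunks
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:  # end of stream
            break
        leftover += chunk
        # Everything before the last newline is complete; the last piece may be partial.
        *complete, leftover = leftover.split("\n")
        for line in complete:
            yield line
    if leftover:  # final line without a trailing newline
        yield leftover


sample = io.StringIO("This is line 1\nThis is line 2\nhello world.")
lines = list(iter_lines_buffered(sample, buffer_size=16))
print(lines)  # ['This is line 1', 'This is line 2', 'hello world.']
```

The same carry-over idea could be applied to sentences instead of lines, for example by passing the accumulated text to a sentence tokenizer and holding back the last, possibly unfinished, sentence.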


