# NLP - Session 2 - Working with Text files 

#### Working with text files
 - Working with f-strings for formatted printing
 - Reading and writing .CSV and .TSV files
 - Using %%writefile to create simple .txt files [works in Jupyter notebooks only]
 - Reading and writing files with Python's built-in functions

## String Formatter

In [None]:
name = "Iron Man"

In [None]:
print("I am the {}".format(name))

In [None]:
print(f"I am the {name}")

## Minimum Width and Alignment Between Columns

In [None]:
ds_tutes = [
    ("Python for beginners", 19),
    ("Feature Selection for ML", 11),
    ("ML Tutorial", 11),
    ("Deep Learning Tutorials", 19),
]

In [None]:
ds_tutes

In [None]:
for info in ds_tutes:
    print(info)

In [None]:
# Now using formatting
for info in ds_tutes:
    print(f"{info[0]:{50}} {info[1]:{10}}")  # minimum widths: 50 for the title column, 10 for the number column

## Using >, <, ^
 - :< - Left align
 - :> - Right align
 - :^ - Center align

In [None]:
print("Left Align")
print()
for info in ds_tutes:
    print(f"{info[0]:<{50}} {info[1]:{10}}")

print()
print("Right Align")
print()
for info in ds_tutes:
    print(f"{info[0]:>{50}} {info[1]:{10}}")

print()
print("Center Align")
print()
for info in ds_tutes:
    print(f"{info[0]:^{50}} {info[1]:{10}}")
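
As a compact check of the three alignment flags, the sketch below pads a short string to a width of 10 and marks the field edges with `|` so the padding is visible. The width of 10 and the sample values are arbitrary choices for illustration. It also shows the defaults when no flag is given: strings align left, numbers align right.

```python
title = "ML"
pages = 11

left = f"|{title:<10}|"         # text first, padding on the right
right = f"|{title:>10}|"        # padding first, text on the right
center = f"|{title:^10}|"       # padding split on both sides
default_str = f"|{title:10}|"   # strings are left-aligned by default
default_num = f"|{pages:10}|"   # numbers are right-aligned by default

for row in (left, right, center, default_str, default_num):
    print(row)
```

This is why the number column in the cells above lines up on the right even though no alignment flag is given for it.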

## Working with .tsv and .csv files

In [None]:
import pandas as pd

In [None]:
data = pd.read_csv("data/nlp-spam.tsv", sep="\t")
data.head()

In [None]:
data.shape

In [None]:
data["label"].value_counts()

In [None]:
ham = data[data["label"] == "ham"]
ham.head()

In [None]:
ham.to_csv("data/ham.tsv", sep="\t")

In [None]:
ham.to_csv("data/ham.csv")

As the output below shows, pandas writes the DataFrame index as an extra column when saving. Passing `index=False` avoids this.

In [None]:
pd.read_csv("data/ham.tsv", sep="\t")

Save again without the index.

In [None]:
ham.to_csv("data/ham.tsv", sep="\t", index=False)

In [None]:
ham.to_csv("data/ham.csv", index=False)

Now the index column is no longer written.

In [None]:
pd.read_csv("data/ham.tsv", sep="\t")
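
The effect of `index=False` can be reproduced without any files on disk. The sketch below uses a small made-up DataFrame (the rows are invented for illustration and are not taken from `nlp-spam.tsv`) and writes to in-memory `io.StringIO` buffers instead of `data/ham.tsv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for the spam dataset.
df = pd.DataFrame(
    {"label": ["ham", "spam", "ham"], "message": ["hi there", "win cash", "see you"]}
)
ham = df[df["label"] == "ham"]

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
ham.to_csv(with_index, sep="\t")
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index, sep="\t").columns)

# index=False: only the real columns are written.
without_index = io.StringIO()
ham.to_csv(without_index, sep="\t", index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index, sep="\t").columns)

print(cols_with_index)
print(cols_without_index)
```

The first list has one more entry than the second. That extra entry is the saved index, which pandas reads back as an unnamed column.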

## Built-in Jupyter magic command: %%writefile

In [None]:
%%writefile data/nlp_ses_2.txt
Hello, this is the NLP Lesson
Please like and subscribe to show your support

Append lines to the file above with the `-a` flag.

In [None]:
%%writefile -a data/nlp_ses_2.txt
Thanks for watching

## Reading and writing with Python's built-in functions

### open()

In [None]:
file = open("data/nlp_ses_2.txt", "r")  # the file created with %%writefile above

In [None]:
file

### read()

In [None]:
file.read()

Trying to read again returns an empty string, because read() leaves the cursor at the end of the file. After the first read there is nothing left to read.

In [None]:
file.read()

Let's put the cursor back at the starting point.

### seek()

In [None]:
file.seek(0)

Now let's try to read again. Since the cursor is back at the starting position, read() returns the full contents again.

In [None]:
file.read()
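
The read, re-read, seek sequence above can be condensed into one self-contained sketch. It writes a throwaway temporary file first so that it does not depend on anything in `data/`. The file contents are made up for illustration.

```python
import os
import tempfile

# Create a small temporary file to read from.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path, "r") as f:
    first_read = f.read()    # whole file; cursor is now at the end
    second_read = f.read()   # nothing left, so this is an empty string
    f.seek(0)                # move the cursor back to the start
    third_read = f.read()    # whole file again

os.remove(path)  # clean up the temporary file

print(repr(first_read))
print(repr(second_read))
print(repr(third_read))
```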

In [None]:
file.seek(0)

### readline()
Reads one line per call and advances the cursor to the next line.

In [None]:
file.readline()

In [None]:
file.readline()

In [None]:
file.readline()

In [None]:
file.seek(0)

### readlines()
Reads all remaining lines at once and returns them as a list.

In [None]:
file.readlines()

### close()

In [None]:
file.close()

With `with open`, there is no need to call close(): the file is closed automatically when the block ends.

In [None]:
with open("data/nlp_ses_2.txt") as file:
    text_data = file.readlines()
    print(text_data)
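
One way to confirm the automatic close is to inspect the file object's `closed` attribute inside and after the block. This is a minimal sketch that uses a temporary file with made-up contents rather than `data/nlp_ses_2.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("line one\nline two\n")

    with open(path) as f:
        closed_inside = f.closed   # still open inside the block
        lines = f.readlines()
    closed_after = f.closed        # closed as soon as the block ends

print(closed_inside, closed_after)
print(lines)
```

The file is closed even if the code inside the block raises an exception, which is the main reason `with` is preferred over a manual close().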

In [None]:
for temp in text_data:
    print(temp)

### strip()
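As you can see above, there is a blank line after each printed line. Each string from readlines() keeps its trailing newline, and print() adds another one. Calling strip() removes the trailing newline.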

In [None]:
for temp in text_data:
    print(temp.strip())
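
Note that strip() removes all leading and trailing whitespace, not only the newline. When indentation matters, `rstrip("\n")` is a narrower option. The sketch below compares the two on made-up lines.

```python
lines = ["  indented line\n", "plain line\n"]

# strip() drops the leading spaces as well as the newline.
stripped = [line.strip() for line in lines]

# rstrip("\n") drops only the trailing newline and keeps the indentation.
newline_only = [line.rstrip("\n") for line in lines]

print(stripped)
print(newline_only)

# Another option: keep the strings as they are and stop print() from adding a newline.
for line in lines:
    print(line, end="")
```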

### enumerate()

In [None]:
for i, temp in enumerate(text_data):
    print(f"{i}  ----->  {temp.strip()}")

### File writing

In [None]:
file = open("data/nlp_ses_2_2.txt", "w")

In [None]:
file

In [None]:
file.write("This is just another way to write")

At this point the text may still be sitting in a buffer and not yet be on disk. Calling close() flushes the buffer and finishes the write.

In [None]:
file.close()
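
The number shown when the `file.write(...)` cell runs is the count of characters written. The sketch below captures that return value and reads the file back after close() to confirm the contents. It writes to a temporary directory, and the file name `write_demo.txt` is a placeholder.

```python
import os
import tempfile

text = "This is just another way to write"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "write_demo.txt")  # hypothetical file name

    f = open(path, "w")
    chars_written = f.write(text)  # write() returns the number of characters written
    f.close()                      # flushes buffered data to disk and releases the file

    with open(path) as check:
        content = check.read()

print(chars_written)
print(content)
```

If the file has to stay open for more writes, `f.flush()` pushes the buffered text out without closing it.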

The whole write can be done in a `with` block, which removes the need for a separate close() call, as shown below.

In [None]:
with open("data/nlp_ses_2_3.txt", "w") as file:
    file.write("This is just another way to write")

Now let's append to the file, again without closing it manually.

In [None]:
with open("data/nlp_ses_2_3.txt", "a") as file:
    file.write("This is just another way to write")

In [None]:
with open("data/nlp_ses_2_3.txt", "a") as file:
    for temp in text_data:
        file.write(temp)  # write each line read earlier; each already ends with "\n"
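
Keep in mind that write() does not add a newline by itself, so consecutive writes of plain strings end up on the same line. The sketch below is a minimal illustration using a temporary file and made-up text. It adds the `"\n"` explicitly and also uses `writelines()`, which writes a list of strings as they are.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "append_demo.txt")  # hypothetical file name

    # "w" creates the file, or truncates it if it already exists.
    with open(path, "w") as f:
        f.write("first\n")

    # "a" keeps the existing contents and writes at the end.
    with open(path, "a") as f:
        f.write("second\n")
        f.writelines(["third\n", "fourth\n"])  # no newlines are added for you

    with open(path) as f:
        lines = f.read().splitlines()

print(lines)
```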