### Importing Data Using Python’s open() Function

In [3]:
# Read an entire text file with the built-in open() function

# text file (.txt)
path = "./textfiles/excerpt.txt"
with open(path, "r", encoding="utf-8") as f:    # decode the file explicitly as UTF-8
    content = f.read()
print(content)

Today, robots can talk to humans using natural language, and they’re getting smarter. Even so, very few people understand how these robots work or how they might use these technologies in their own projects.

Natural language processing (NLP) – a branch of artificial intelligence that helps machines understand and respond to human language – is the key technology that lies at the heart of any digital assistant product.


In [4]:
# Print only the non-empty lines, each with its line number
with open(path, "r", encoding="utf-8") as f:
    for i, line in enumerate(f):    # enumerate() pairs each line with its index, starting at 0
        if line.strip():
            print(f"Line: {i}, {line.strip()}")

Line: 0, Today, robots can talk to humans using natural language, and they’re getting smarter. Even so, very few people understand how these robots work or how they might use these technologies in their own projects.
Line: 2, Natural language processing (NLP) – a branch of artificial intelligence that helps machines understand and respond to human language – is the key technology that lies at the heart of any digital assistant product.
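
The line numbers above start at 0 because enumerate() counts from zero by default. A minimal sketch, assuming you would rather see 1-based numbers as most editors show them, passes `start=1`. The sample text is an in-memory stand-in (`io.StringIO`), not the excerpt file.

```python
import io

# In-memory stand-in for a text file: two paragraphs separated by a blank line.
sample = io.StringIO("First paragraph.\n\nSecond paragraph.\n")

numbered = []
for line_no, line in enumerate(sample, start=1):  # count from 1 instead of 0
    text = line.strip()
    if text:  # skip blank lines
        numbered.append((line_no, text))

for line_no, text in numbered:
    print(f"Line: {line_no}, {text}")
```

The blank line still consumes a number, so the second paragraph is reported as line 3.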


In [7]:
# Rather than printing the lines, you can collect them in a list by using a list comprehension
with open(path, "r", encoding="utf-8") as f:
    lst = [line.strip() for line in f if line.strip()]

print(lst)

["Today, robots can talk to humans using natural language, and they're getting smarter. Even so, very few people understand how these robots work or how they might use these technologies in their own projects.", 'Natural language processing (NLP) – a branch of artificial intelligence that helps machines understand and respond to human language – is the key technology that lies at the heart of any digital assistant product.']
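
When open() is called in text mode without an `encoding` argument, many Python versions fall back to a platform-dependent default, so non-ASCII characters such as a curly apostrophe can be decoded incorrectly. The sketch below is only an illustration: it writes a short UTF-8 sample to a temporary file (a hypothetical stand-in for the excerpt) and reads it back once as UTF-8 and once as cp1252.

```python
import os
import tempfile

text = "they’re getting smarter"  # contains a curly apostrophe (U+2019)

# Write the sample as UTF-8 to a temporary file (stand-in for a real text file).
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt", delete=False) as tmp:
    tmp.write(text)
    tmp_path = tmp.name

try:
    # Decoding with the encoding the file was written in round-trips cleanly.
    with open(tmp_path, "r", encoding="utf-8") as f:
        decoded_ok = f.read()
    # Decoding the same bytes as cp1252 turns the apostrophe into three characters.
    with open(tmp_path, "r", encoding="cp1252") as f:
        decoded_wrong = f.read()
finally:
    os.remove(tmp_path)

print(decoded_ok)     # they’re getting smarter
print(decoded_wrong)  # theyâ€™re getting smarter
```

Passing `encoding="utf-8"` explicitly makes the result independent of the machine the code runs on.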


### Tabular Data Files
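A tabular data file is a file in which the data is structured into rows. In a CSV (comma-separated values) file, each row is a line of text whose fields are separated by commas.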

In [18]:
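# Read a CSV file into a list of dictionaries, one per row
import csv
from pprint import pprint
path = "./csvfiles/cars.csv"
with open(path, "r", newline="") as csv_file:    # the csv docs recommend newline="" for files passed to csv
    csv_reader = csv.DictReader(csv_file)        # the first row supplies the dictionary keys
    cars = []
    for row in csv_reader:
        cars.append(dict(row))
pprint(cars)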

[{'Make': 'Ford', 'Model': 'E350', 'Price': '3200.00', 'Year': '1997'},
 {'Make': 'Chevy', 'Model': 'Venture', 'Price': '4800.00', 'Year': '1999'},
 {'Make': 'Jeep',
  'Model': 'Grand Cherokee',
  'Price': '4900.00',
  'Year': '1996'}]


In [20]:
# Alternatively, use csv.reader instead of csv.DictReader; each row is then returned as a list of strings
import csv
path = "./csvfiles/cars.csv"
with open(path, "r", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)
    cars = []
    for row in csv_reader:    # the header row is included as the first item
        cars.append(row)
print(cars)

[['Year', 'Make', 'Model', 'Price'], ['1997', 'Ford', 'E350', '3200.00'], ['1999', 'Chevy', 'Venture', '4800.00'], ['1996', 'Jeep', 'Grand Cherokee', '4900.00']]
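
Both readers return every field as a string, including the year and the price. A minimal sketch of one way to convert them, assuming the columns shown in the output above; the inline `csv_text` is a stand-in for cars.csv so the example runs on its own.

```python
import csv
import io

# Inline copy of the rows shown in the output above; stands in for cars.csv.
csv_text = """Year,Make,Model,Price
1997,Ford,E350,3200.00
1999,Chevy,Venture,4800.00
1996,Jeep,Grand Cherokee,4900.00
"""

cars = []
for row in csv.DictReader(io.StringIO(csv_text)):
    cars.append({
        "Year": int(row["Year"]),        # "1997" -> 1997
        "Make": row["Make"],
        "Model": row["Model"],
        "Price": float(row["Price"]),    # "3200.00" -> 3200.0
    })

total = sum(car["Price"] for car in cars)
print(cars[0])
print(f"Total price: {total:.2f}")
```

Once the values are numeric, arithmetic such as the total above works directly; with the raw strings, `sum()` would raise a TypeError.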


In [25]:
# JSON file (exercise 5)
import json
path = "./jsonfiles/cars.json"
with open(path, "r") as json_file:
    data = json.load(json_file)    # parse the whole file into Python objects
    for car in data["cars"]:
        for key, value in car.items():
            print(f"{key}: {value}")
        print()

Year: 1997
Make: Ford
Model: E350
Price: 3200.00

Year: 1999
Make: Chevy
Model: Venture
Price: 4800.00

Year: 1996
Make: Jeep
Model: Grand Cherokee
Price: 4900.00
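
The cars.json file itself is not shown. The sketch below assumes a layout inferred from the printed output (a top-level "cars" key holding a list of objects with string values); the real file may differ. It parses that text with `json.loads()` to show which Python types come back.

```python
import json

# Assumed layout of cars.json, inferred from the printed output; the real file may differ.
json_text = """
{
  "cars": [
    {"Year": "1997", "Make": "Ford", "Model": "E350", "Price": "3200.00"},
    {"Year": "1999", "Make": "Chevy", "Model": "Venture", "Price": "4800.00"},
    {"Year": "1996", "Make": "Jeep", "Model": "Grand Cherokee", "Price": "4900.00"}
  ]
}
"""

data = json.loads(json_text)    # loads() parses a string; load() parses a file object
models = [car["Model"] for car in data["cars"]]

print(type(data).__name__, type(data["cars"]).__name__)
print(models)
```

A JSON object becomes a dict and a JSON array becomes a list, which is why the cell above can index `["cars"]` and loop over the result.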



In [28]:
# Binary Files
# Executable files (.exe) and image files (.jpeg, .bmp, and so on) contain data in binary format, represented as a sequence of bytes.
# Because these bytes are typically meant to be interpreted as something other than text characters,
#   reading such a file in text mode will usually fail to decode or will corrupt its content.
# Instead, open it in binary mode by adding "b" to the mode string passed to open().

image = "./binaryfiles/bentley.jpg"

with open(image,"rb") as image_file:
    content = image_file.read()
print(f"bytes: {len(content)}")

bytes: 111483
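
In binary mode, read() returns a `bytes` object, so individual bytes can be inspected directly. As an illustration, the sketch below checks whether a file starts with the three-byte JPEG signature. The helper name `looks_like_jpeg` is hypothetical, and the file it checks is a tiny fake written to a temporary path, not a real image.

```python
import os
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files begin with these three bytes

def looks_like_jpeg(path):
    """Hypothetical helper: True if the file starts with the JPEG signature."""
    with open(path, "rb") as f:
        header = f.read(len(JPEG_MAGIC))  # read only the first few bytes
    return header == JPEG_MAGIC

# Build a tiny fake file so the sketch runs without a real image.
with tempfile.NamedTemporaryFile("wb", suffix=".jpg", delete=False) as tmp:
    tmp.write(JPEG_MAGIC + b"\xe0" + b"\x00" * 16)
    fake_path = tmp.name

try:
    result = looks_like_jpeg(fake_path)
finally:
    os.remove(fake_path)

print(result)
```

Reading only the header avoids loading the whole file into memory, which matters for large binaries.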
