# Q1. Describe the differences between text and binary files in a single paragraph.

A text file stores characters encoded as bytes under a specific encoding such as ASCII or UTF-8, usually organized into lines separated by newline characters, so it can be opened and read in any text editor. A binary file, on the other hand, stores bytes that are not meant to be decoded as characters and can represent any kind of information, such as images, audio, or executable code. Binary files are not directly readable by humans and usually require software that knows the file's format to interpret the data. In Python, reading a file in text mode returns str objects after decoding, while reading it in binary mode returns raw bytes objects.

Code example:

To read a text file in Python, we can use the built-in open() function with the mode argument set to 'r' (read). The first cell below creates the file with mode 'w' (write) so that there is something to read:

In [3]:
with open('file.txt', 'w') as f:
    f.write('This is a text file.')


In [4]:
with open('file.txt', 'r') as file:
    contents = file.read()
    print(contents)


This is a text file.


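To read a binary file, we can use the 'rb' (read binary) mode. The first cell below writes the integer 12345 with 'wb' (write binary) using struct.pack, and the second cell reads the bytes back and unpacks them. The format 'i' uses the machine's native byte order, so the output shown comes from a little-endian machine: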

In [10]:
import struct

with open('file.bin', 'wb') as f:
    n = 12345
    f.write(struct.pack('i', n))


In [14]:
with open('file.bin', 'rb') as file:
    data = file.read()
    print('Coded ->', data)
    print(struct.unpack('i', data))


Coded -> b'90\x00\x00'
(12345,)
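
The difference between the two modes can also be seen by reading the same file twice. The following is a minimal sketch, assuming a scratch file named sample.txt (a hypothetical name) and UTF-8 encoding: text mode decodes the bytes into a str, while binary mode returns the stored bytes unchanged.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')  # hypothetical file name

    # Write one line of text that contains a non-ASCII character.
    with open(path, 'w', encoding='utf-8', newline='\n') as f:
        f.write('café\n')

    # Text mode decodes the bytes and returns a str.
    with open(path, 'r', encoding='utf-8') as f:
        as_text = f.read()

    # Binary mode returns the raw bytes exactly as stored.
    with open(path, 'rb') as f:
        as_bytes = f.read()

print(type(as_text).__name__, len(as_text))    # str 5
print(type(as_bytes).__name__, len(as_bytes))  # bytes 6
```

The character é takes two bytes in UTF-8, which is why the byte count is larger than the character count.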


# Q2. What are some scenarios where using text files will be the better option? When would you like to use binary files instead of text files?

Text files are better suited for storing data that can be represented as plain text, such as configuration files, log files, and data that needs to be human-readable or editable. Binary files, on the other hand, are used to store complex data types such as images, audio files, or executable code, where the structure of the data is not easily represented as text.

Code example:

In [15]:
# Write a text file
with open('file.txt', 'w') as file:
    file.write('This is a text file.\n')

# Read a text file
with open('file.txt', 'r') as file:
    contents = file.read()
    print(contents)

# Write a binary file
with open('file.bin', 'wb') as file:
    data = bytes([0x10, 0x20, 0x30, 0x40])
    file.write(data)

# Read a binary file
with open('file.bin', 'rb') as file:
    data = file.read()
    print(data)


This is a text file.

b'\x10 0@'
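
One way to see the trade-off is to store the same numbers both ways. This is only a sketch with made-up sample values: the text form is readable in any editor, while the binary form (here three 4-byte integers packed with struct) is smaller but opaque.

```python
import struct

values = [1234567890, 2000000000, 1999999999]  # illustrative sample data

# Text form: decimal digits separated by newlines.
as_text = '\n'.join(str(v) for v in values).encode('ascii')

# Binary form: three 4-byte little-endian signed integers.
as_binary = struct.pack('<3i', *values)

print(len(as_text), len(as_binary))  # 32 12

# Both forms carry the same information and can be decoded again.
from_text = [int(line) for line in as_text.decode('ascii').split('\n')]
from_binary = list(struct.unpack('<3i', as_binary))
print(from_text == from_binary == values)  # True
```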


# Q3. What are some of the issues with using binary operations to read and write a Python integer directly to disk?

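One issue is that a Python integer cannot be written to disk as it is: write() in binary mode accepts only bytes-like objects, and because Python integers have arbitrary precision, you must choose a fixed size (and whether the value is signed) when converting it with to_bytes() or struct.pack(). A value that does not fit in the chosen size raises an error. A second issue is that the byte order (endianness) may differ between architectures, so data written in native order on one system can be read back as a different number on another; specifying the byte order explicitly, as in the example below, avoids this. Finally, binary files are not human-readable, so debugging or editing the data can be difficult.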

Code example:

In [16]:
# Write an integer to a binary file
with open('file.bin', 'wb') as file:
    data = 12345
    file.write(data.to_bytes(4, byteorder='big'))

# Read an integer from a binary file
with open('file.bin', 'rb') as file:
    data = int.from_bytes(file.read(4), byteorder='big')
    print(data)


12345
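
The byte-order and size issues can be reproduced without touching the disk. The sketch below uses only int.to_bytes() and int.from_bytes(); the value 2**40 is just an arbitrary number chosen because it does not fit in 4 bytes.

```python
n = 12345

# The same integer has two different 4-byte layouts.
big = n.to_bytes(4, byteorder='big')
little = n.to_bytes(4, byteorder='little')
print(big)     # b'\x00\x0009'
print(little)  # b'90\x00\x00'

# Reading with the wrong byte order silently gives a different number.
misread = int.from_bytes(big, byteorder='little')
print(misread)  # 959447040

# A value that does not fit in the chosen size cannot be converted.
try:
    (2 ** 40).to_bytes(4, byteorder='big')
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)  # True
```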


# Q4. Describe a benefit of using the with keyword instead of explicitly opening a file.

Using the with keyword to open a file has the benefit of automatically closing the file when the block of code is exited. This ensures that the file is properly closed and any resources associated with it are freed, even if an exception is raised during the execution of the block.

Code example:

In [18]:
# Using with statement to open a file
with open('file.txt', 'r') as f:
    data = f.read()
    print(data)


This is a text file.
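
For comparison, here is roughly what the with statement saves you from writing. This is a sketch, assuming a scratch file created just for the demonstration: the explicit version needs try/finally to guarantee the close, while with closes the file even when the block raises.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.txt')
    with open(path, 'w') as out:
        out.write('This is a text file.')

    # Explicit open: the close must be guaranteed by hand.
    f = open(path, 'r')
    try:
        data = f.read()
    finally:
        f.close()

    # with: the file is closed even though the block raises.
    try:
        with open(path, 'r') as g:
            raise ValueError('simulated failure')
    except ValueError:
        pass

print(data)                # This is a text file.
print(f.closed, g.closed)  # True True
```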



# Q5. Does Python keep the trailing newline when reading a line of text? Does Python append a newline when you write a line of text?

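Answer: Yes for reading, no for writing. When reading, readline() (like iterating over the file) returns each line with its trailing newline character \n still attached; only the last line may lack one, if the file does not end with a newline. When writing, write() does not append anything: it writes exactly the string it is given, so you must include \n yourself, as the example below does. The extra blank line in the output appears because print() adds its own newline after the one kept by readline().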

Example:

In [19]:
# Writing to a file
with open('example.txt', 'w') as f:
    f.write('hello\nworld\n')

# Reading from a file
with open('example.txt', 'r') as f:
    data = f.readline()
    print(data)  


hello
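
A quick check of how write() and readline() treat newlines, as a sketch using a scratch file: two write() calls without \n end up on the same line, print(..., file=f) adds a newline for you, and rstrip('\n') removes the newline that reading keeps.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.txt')

    with open(path, 'w') as f:
        f.write('hello')         # no newline is added here
        f.write('world\n')       # the newline must be written explicitly
        print('again', file=f)   # print() does add a newline

    with open(path, 'r') as f:
        lines = f.readlines()

print(lines)  # ['helloworld\n', 'again\n']

# Strip the trailing newline when it is not wanted.
stripped = [line.rstrip('\n') for line in lines]
print(stripped)  # ['helloworld', 'again']
```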



# Q6. What file operations enable random access?

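Answer: Random access relies on the seek() and tell() methods. The tell() method returns the current position of the file pointer, while seek(offset, whence) moves it. By default the offset is counted from the start of the file; the optional whence argument can instead select the current position or the end of the file. Opening the file in binary mode makes the offsets plain byte counts.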

Example:

In [20]:
# Opening a file in binary mode
with open('example.txt', 'rb') as f:
    # Move the file pointer to the 5th byte
    f.seek(4)
    # Read 1 byte from the file
    data = f.read(1)
    print(data)  
    # Get the current position of the file pointer
    pos = f.tell()
    print(pos)  


b'o'
5
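
Random access is most useful when every record has the same size, because the position of record k is then simply k times the record size. The sketch below assumes a hypothetical file of ten 4-byte integers (the squares of 0 to 9) and jumps to individual records without reading the ones before them.

```python
import os
import struct
import tempfile

RECORD = struct.Struct('<i')  # one 4-byte little-endian signed integer

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'squares.bin')  # hypothetical file name

    # Write ten fixed-size records: the squares of 0..9.
    with open(path, 'wb') as f:
        for n in range(10):
            f.write(RECORD.pack(n * n))

    with open(path, 'rb') as f:
        # Jump straight to record index 7 without reading records 0..6.
        f.seek(7 * RECORD.size)
        (at_index_7,) = RECORD.unpack(f.read(RECORD.size))

        # whence=os.SEEK_END measures the offset from the end of the file.
        f.seek(-RECORD.size, os.SEEK_END)
        (last,) = RECORD.unpack(f.read(RECORD.size))

        # After reading the last record the pointer is at the end of the file.
        end_position = f.tell()

print(at_index_7, last, end_position)  # 49 81 40
```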


# Q7. When do you think you'll use the struct package the most?

Answer: The struct package is used for working with structured binary data, such as binary file formats and network protocols. You would use it most when you need to read or write data whose layout is fixed in advance: a format string specifies the byte order, the size and type of each field, and the order of the fields, and pack() and unpack() convert between Python values and that byte layout.

Example:

In [21]:
import struct

# Packing data into a binary string
data = struct.pack('>i', 123)
print(data)  

# Unpacking binary data into variables
data = b'\x00\x00\x00{'
value = struct.unpack('>i', data)[0]
print(value)  


b'\x00\x00\x00{'
123
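
The format string becomes more useful when a record has several fields. The following sketch assumes an invented record layout (a 2-byte id, a 4-byte tag, and an 8-byte float); it is not taken from any real file format.

```python
import struct

# Hypothetical layout: little-endian, unsigned 16-bit id,
# 4-byte tag, 64-bit float reading.
RECORD_FORMAT = '<H4sd'

packed = struct.pack(RECORD_FORMAT, 7, b'TEMP', 21.5)
print(struct.calcsize(RECORD_FORMAT))  # 14
print(len(packed))                     # 14

# unpack() returns the fields as a tuple, in the order of the format.
sensor_id, tag, reading = struct.unpack(RECORD_FORMAT, packed)
print(sensor_id, tag, reading)  # 7 b'TEMP' 21.5
```

Using an explicit byte-order prefix such as '<' also switches off the padding that native mode may insert between fields, so the record is exactly 2 + 4 + 8 bytes long.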


# Q8. When is pickling the best option?

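Answer: Pickling is the best option when you need to save Python objects, such as lists, dictionaries, or instances of your own classes, and load them back later in Python with their structure and types intact. Pickling converts the objects into a binary format that can be saved to a file and loaded back into memory later. Because the format is specific to Python and unpickling can execute arbitrary code, it suits data you produced yourself or otherwise trust, not data exchanged with other languages or received from untrusted sources.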

Example:

In [22]:
import pickle

# define a complex object
my_object = {'a': [1, 2, 3], 'b': {'x': 1, 'y': 2}, 'c': (3, 4, 5)}

# serialize the object using pickle
serialized = pickle.dumps(my_object)

# deserialize the object
deserialized = pickle.loads(serialized)

print(deserialized) 


{'a': [1, 2, 3], 'b': {'x': 1, 'y': 2}, 'c': (3, 4, 5)}
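
The same round trip works with a file on disk by using pickle.dump() and pickle.load(). This sketch uses made-up sample data containing a set, a tuple, and a date, which are types that a plain-text format such as JSON would not preserve as they are.

```python
import datetime
import os
import pickle
import tempfile

# Illustrative data only.
record = {
    'tags': {'a', 'b'},
    'point': (3, 4),
    'created': datetime.date(2024, 1, 15),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'record.pkl')  # hypothetical file name

    # Pickle files must be opened in binary mode.
    with open(path, 'wb') as f:
        pickle.dump(record, f)

    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == record)                  # True
print(type(restored['tags']).__name__)     # set
print(type(restored['created']).__name__)  # date
```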


# Q9. When will it be best to use the shelve package?

The shelve package is best used when you need a simple way to persistently store and retrieve Python objects by key. It's essentially a dictionary that is stored on disk and accessed like a regular Python dictionary. The values can be any objects that pickle can handle, and each one is loaded only when its key is looked up, so the whole data set does not have to be read into memory at once.

Code example:

In [23]:
import shelve

# create a new shelve database
with shelve.open('mydata.db') as db:
    # add some key-value pairs
    db['key1'] = 'value1'
    db['key2'] = [1, 2, 3]
    db['key3'] = {'a': 1, 'b': 2}

# retrieve the data from the shelve database
with shelve.open('mydata.db') as db:
    print(db['key1'])
    print(db['key2']) 
    print(db['key3']) 


value1
[1, 2, 3]
{'a': 1, 'b': 2}


# Q10. What is a special restriction when using the shelve package, as opposed to using other data dictionaries?

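The shelve package has a special restriction in that the keys of the dictionary must be strings. This is because a shelf is backed by a dbm database, which stores its keys as bytes: shelve encodes each string key before passing it to the database and does not convert other types. If you try to use a key that is not a string, such as an integer or a tuple, the assignment raises an exception. This can be limiting if you need non-string keys; a common workaround is to convert the key to a string first. The values, in contrast, can be any objects that pickle can handle. Additionally, the shelve package does not support concurrent read/write access, so it's important to ensure that a shelf is written to by only one thread or process at a time.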

In [28]:
import shelve

# Open the shelf and add key-value pairs
with shelve.open('my_shelf.db') as shelf:
    shelf['key1'] = {'name': 'John', 'age': 30}
    shelf['key2'] = {'name': 'Mary', 'age': 25}

    # Note: adding or reassigning a key always works. Without writeback=True,
    # however, mutating a stored value in place (shelf['key2']['age'] = 30)
    # changes only a temporary copy and is not saved.

# The first shelf is closed here, so the file can be opened again,
# this time with writeback=True
with shelve.open('my_shelf.db', writeback=True) as shelf:
    # Modify an existing value in place
    shelf['key2']['age'] = 30

# Reopen the shelf to confirm that the modification was saved
with shelve.open('my_shelf.db') as shelf:
    print(shelf['key2'])


{'name': 'Mary', 'age': 30}

In this example, we first open a shelf named my_shelf.db and add two key-value pairs to it. Once that block has closed the shelf, we open the file again with the writeback=True argument, which keeps the accessed entries in a cache and writes them back when the shelf is closed, so values can be modified in place without explicitly reassigning them. We modify the age value of key2, and reopening the shelf shows that the modification was saved. Note that the shelf must be closed before it is opened a second time: opening the same file again while it is still open can fail with an error such as [Errno 11] Resource temporarily unavailable.
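
The string-key restriction can be checked directly. The sketch below assumes a scratch shelf named demo_shelf (a hypothetical name) and catches a broad Exception on purpose, since the exact exception type raised for a non-string key is an implementation detail; converting the key with str() is one possible workaround.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_shelf')  # hypothetical shelf name

    with shelve.open(path) as db:
        # An integer key is rejected.
        try:
            db[42] = 'answer'
            rejected = False
        except Exception:
            rejected = True

        # Converting the key to a string first works.
        db[str(42)] = 'answer'

    # The value stored under the string key persists after closing.
    with shelve.open(path) as db:
        stored = db['42']
        keys = sorted(db.keys())

print(rejected)  # True
print(stored)    # answer
print(keys)      # ['42']
```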