# Writing data to a text file

**There are a couple of ways to write data to a text file. One is to 'print' the Python data and send the output to a text file rather than to the screen, which is the default.**

In [1]:
# Python data

data = [
    "Andromeda - Shrub",
    "Bellflower - Flower",
    "China Pink - Flower",
    "Daffodil - Flower",
    "Evening Primrose - Flower",
    "French Marigold - Flower",
    "Hydrangea - Shrub",
    "Iris - Flower",
    "Japanese Camellia - Shrub",
    "Lavender - Shrub",
    "Lilac - Shrub",
    "Magnolia - Shrub",
    "Peony - Shrub",
    "Queen Anne's Lace - Flower",
    "Red Hot Poker - Flower",
    "Snapdragon - Flower",
    "Sunflower - Flower",
    "Tiger Lily - Flower",
    "Witch Hazel - Shrub",
]

In [2]:
# Writing to this file

plants_filename = 'plants_data.txt'

In [3]:
# Open text file with writing capability

with open(plants_filename, 'w') as plants:
    for plant in data:
        print(plant, file=plants)

**You can also write data to a text file using the `write()` method, which is called on the text file object, i.e. the object that is being written to.**

In [4]:
plants_filename = 'plants_data_2.txt'

with open(plants_filename, 'w') as plants:
    for plant in data:
        # write() does not add a newline, so append one to keep one plant per line
        plants.write(plant + '\n')
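
**Unlike `print()`, `write()` emits exactly the string it is given and does not append a newline. The sketch below compares the two, using in-memory `io.StringIO` buffers as stand-ins for real files. The buffers and the two-item `sample` list are illustrative assumptions, not part of the exercise data.**

```python
import io

sample = ["Iris - Flower", "Peony - Shrub"]

# print() appends a newline after every item
printed = io.StringIO()
for plant in sample:
    print(plant, file=printed)

# write() emits exactly the string it is given, so the items run together
written = io.StringIO()
for plant in sample:
    written.write(plant)

# Adding the newline by hand makes write() match print()
written_with_newlines = io.StringIO()
for plant in sample:
    written_with_newlines.write(plant + "\n")

print(repr(printed.getvalue()))
print(repr(written.getvalue()))
print(repr(written_with_newlines.getvalue()))
```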

**NOTE: Only strings can be written to a text file, since text files hold plain text. Numbers must be converted to strings before being passed to the `write()` method. This is not necessary with `print()`, which converts numbers to strings automatically.**

In [5]:
# NUMBERS PASS THROUGH

filename = 'printed_nos.txt'

with open(filename, 'w') as num_test:
    for i in range(10):
        print(i, file=num_test)

In [6]:
# NUMBERS DO NOT PASS THROUGH

filename = 'written_nos.txt'

with open(filename, 'w') as num_test:
    for i in range(10):
        num_test.write(i)

TypeError: write() argument must be str, not int

In [7]:
filename = 'written_nos.txt'

with open(filename, 'w') as num_test:
    for i in range(10):
        # Convert to a string first; add a newline so each number gets its own line
        num_test.write(str(i) + '\n')

## Reading and Writing to the same text file

**'Random access' means being able to read or write at any position in a file, not just from start to end. It is the same idea as computer memory, where any location can be read or written directly (RAM stands for Random Access Memory). Using a FILE POINTER, you can move to different positions in the file. The file pointer keeps track of where you are in a file, like the cursor in a document.**

**When you write data to a file, the data goes in at the current position of the file pointer, and the pointer then moves forward, as a cursor does when you type. Reading also starts at the current position of the file pointer and moves it forward.**
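
**A minimal sketch of the file pointer moving during a write and a read. The scratch file name and the `w+` mode (create the file, then allow reading as well as writing) are assumptions made only for this demonstration.**

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "pointer_demo.txt")

with open(path, "w+", encoding="utf-8", newline="") as demo:
    start = demo.tell()          # position in a brand new file
    demo.write("first\n")
    after_write = demo.tell()    # writing moved the pointer forward

    demo.seek(0)                 # jump back to the start of the file
    line = demo.readline()       # reading moves the pointer forward too
    after_read = demo.tell()

print(start, after_write, after_read)
print(repr(line))
```

**After the write and after the read the pointer sits at the same place, just past the single line in the file.**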

**NOTE: Random access with text files is very limited. You can only reposition the file pointer to the start of the file, to the end of the file, or to a position previously returned by `tell()`.**
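
**The sketch below shows which seeks a text file accepts: to the very start, to the very end, and back to a position that `tell()` handed out earlier. Other relative seeks are rejected. The scratch file and the `bookmark` variable are hypothetical names used only here.**

```python
import io
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")
with open(path, "w", encoding="utf-8", newline="") as demo:
    demo.write("one\ntwo\nthree\n")

with open(path, "r", encoding="utf-8", newline="") as demo:
    demo.readline()
    bookmark = demo.tell()           # a position handed out by tell()

    demo.seek(0, os.SEEK_END)        # allowed: jump to the very end
    at_end = demo.read()             # nothing is left to read there
    demo.seek(0, os.SEEK_SET)        # allowed: jump to the very start
    first = demo.readline()
    demo.seek(bookmark)              # allowed: return to a tell() bookmark
    second = demo.readline()

    try:
        demo.seek(-3, os.SEEK_END)   # rejected: nonzero end-relative seek
        relative_seek_ok = True
    except io.UnsupportedOperation:
        relative_seek_ok = False

print(repr(first), repr(second), repr(at_end), relative_seek_ok)
```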

**The data for this exercise is lines of tab-separated values, stored in a file with a `.csv` extension. Some editors convert tabs to spaces, so make sure the separators written to the file are real tab characters (`\t`).**

    2022-0001	Squirrel Storage	132.50
    2022-0002	Squirrel Storage	45.30
    2022-0003	Squirrel Storage	834.25
    ...
    
**The data holds the invoice number, company name and the amount.**

**As an accountant, you need to update the invoice file regularly, so the aim is to add a new line efficiently, based on the last line in the file. You need to read data from the file AND write back to it, so the file has to be opened in `r+` mode. You could, of course, use the `csv` module, but in this case we will use string methods to split each line at the tab characters and parse the data.**
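
**For comparison, here is a sketch of both approaches on two sample rows: `csv.reader` with a tab delimiter, and a plain `split('\t')`. The `sample` string is held in memory with `io.StringIO` purely for illustration; the notebook itself reads from the invoice file.**

```python
import csv
import io

# Two sample rows in the same layout as the invoice file
sample = (
    "2022-0001\tSquirrel Storage\t132.50\n"
    "2022-0002\tSquirrel Storage\t45.30\n"
)

# Option 1: let the csv module split each row on tabs
csv_rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# Option 2: strip the newline and split on tabs by hand
manual_rows = [line.split("\t") for line in sample.splitlines()]

for invoice_number, company, amount in manual_rows:
    print(invoice_number, company, amount)
```

**Both options produce the same lists of strings for this data. The manual split is enough here because none of the fields contain tabs or quotes.**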

**Parse the data by extracting the fields and splitting the invoice number into two parts, e.g. 2022 and 0001 (year and number). The next invoice number is generated by increasing the number part by one. The number should reset at the start of each year.**

In [8]:
import datetime

# SEEK_SET is the constant meaning "seek relative to the start of the file"
from os import SEEK_SET
from typing import TextIO

In [9]:
def get_year() -> int:
    """Return the current year as an integer."""
    return datetime.datetime.now().year


In [10]:
def parse_invoice_number(invoice_number: str) -> tuple[int, int]:
    """Split a well-formed invoice "number" into its component parts.

    :param invoice_number: A string of the form YYYY-NNNN
        YYYY is the 4 digit year.
        NNNN is a 4 digit invoice number, left padded with zeros.
        The 2 parts are separated with a "-" character.
    :return: The returned tuple will contain:
        the 4 digit year as an integer,
        the invoice number as an integer.
    """
    year, number = invoice_number.split('-')
    return int(year), int(number)


In [11]:
def next_invoice_number(invoice_number: str) -> str:
    """ Produce the next invoice "number" in sequence.

    The format of `invoice_number` is described in `parse_invoice_number`.

    :param invoice_number: A string representing an invoice number.
    :return: A string representing the next invoice number.
        The numerical part will be incremented, unless the current year
        isn't the same as the year in `invoice_number`. In that case,
        the new invoice number will contain the current year, and the
        numerical part will be set to "0001".
    """
    invoice_year, invoice_num = parse_invoice_number(invoice_number)
    new_invoice_string = ''
    current_year = get_year()
    
    if invoice_year == current_year:
        invoice_num += 1
        new_invoice_string += str(current_year) + '-' + str(invoice_num).zfill(4)
    else:
        new_invoice_string += str(current_year) + '-0001'
        
    return new_invoice_string
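
**The zero padding can be done with `str.zfill()` or with a format specification inside an f-string. A short sketch comparing the two, using made-up values for `year` and `number`:**

```python
year = 2022
number = 24

# Two equivalent ways to left-pad a number with zeros to 4 digits
padded_with_zfill = str(number).zfill(4)
padded_with_format = f"{number:04d}"

# The whole invoice number can be built in a single f-string
invoice = f"{year}-{number:04d}"

# Padding never truncates: a 5 digit number simply stays 5 digits long
overflow = str(10000).zfill(4)

print(padded_with_zfill, padded_with_format, invoice, overflow)
```

**Note that the `f` prefix is required. Without it, the braces are kept as literal text instead of being replaced by values.**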

In [12]:
# Test code for functions above

current_year = get_year()

# List of 4 tuples
# Each tuple contains: 
# test string, 
# correct output for parse_invoice_number function
# correct output for next_invoice_number function
test_data = [
    ('2019-0005', (2019, 5), f'{current_year}-0001'),
    (f'{current_year}-8514', (current_year, 8514), f'{current_year}-8515'),
    (f'{current_year}-0001', (current_year, 1), f'{current_year}-0002'),
    (f'{current_year}-0023', (current_year, 23), f'{current_year}-0024'),
]

# Unpack tuples and iterate through to test string against the correct outputs with functions
for test_string, result, next_number in test_data:
    # Test parse_invoice_number function
    parts = parse_invoice_number(test_string)
    if parts == result:
        print(f'{test_string} parsed successfully')
    else:
        print(f'{test_string} failed to parse. Expected {result} got {parts}')

    # Test next_invoice_number function
    new_number = next_invoice_number(test_string)
    if next_number == new_number:
        print(f'New number {new_number} generated correctly for {test_string}')
    else:
        print(f'New number {new_number} is not correct for {test_string}')

    print('-' * 80)



2019-0005 parsed successfully
New number 2024-0001 generated correctly for 2019-0005
--------------------------------------------------------------------------------
2024-8514 parsed successfully
New number 2024-8515 generated correctly for 2024-8514
--------------------------------------------------------------------------------
2024-0001 parsed successfully
New number 2024-0002 generated correctly for 2024-0001
--------------------------------------------------------------------------------
2024-0023 parsed successfully
New number 2024-0024 generated correctly for 2024-0023
--------------------------------------------------------------------------------


**The function that writes a new line to the file takes an opened file, finds the last line, and splits it into three parts (the invoice number, the company name and the amount) using string methods.**

**The opened file is read line-by-line to reach the last line. You could read all the lines at once and index the list to get the last one, but that holds the whole file in memory, which is wasteful when there is a lot of data.**
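
**As an aside, the same "keep only the last line" idea can be written with `collections.deque` and `maxlen=1`, which discards earlier lines as it goes. This is an alternative sketch, not the approach used in this notebook, and `last_line` is a hypothetical helper name.**

```python
import io
from collections import deque


def last_line(lines) -> str:
    """Return the final item of an iterable of lines, or '' if it is empty."""
    tail = deque(lines, maxlen=1)   # only the most recent line is kept
    return tail[0] if tail else ""


# In-memory stand-in for an open invoice file
sample_file = io.StringIO(
    "2022-0001\tSquirrel Storage\t132.50\n"
    "2022-0002\tSquirrel Storage\t45.30\n"
)

print(repr(last_line(sample_file)))
print(repr(last_line(io.StringIO(""))))
```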

In [13]:
def record_invoice(invoice_file: TextIO, 
                   company: str, 
                   amount: float) -> None:
    """Create a new invoice number, and write it to a file on disk.

    :param invoice_file: An open text file, opened using r+ mode
    :param company: The name of the company being invoiced
    :param amount: The amount on the invoice
    """
    last_row = ''
    
    for line in invoice_file:
        # Can remove after testing
        print('*', end='')
        last_row = line
    
    # If last row has content
    if last_row:
        invoice_number, c, a = last_row.split('\t')
        new_invoice_number = next_invoice_number(invoice_number)
    else:
        # i.e. if file is empty, start numbering from one
        year = get_year()
        new_invoice_number = f"{year}-{1:04d}"
    
    print(f"{new_invoice_number}\t{company}\t{amount}", file=invoice_file)
    


In [14]:
# Open file with reading AND writing mode

data_file = 'data/invoices.csv'

with open(data_file, 'r+', encoding='utf-8', newline='') as invoices:
    record_invoice(invoices, 'Spaghetti Junction', 3400.00)
    

**************

**NOTE: You cannot call the `record_invoice` function more than once in the block of code above. The first call reads the file line-by-line to the end, so on a second call the file pointer is already at the end of the file: the loop body never executes, `last_row` stays an empty string, and the file is treated as empty. The new invoice number would then restart from one, i.e. '*current_year*-0001'.**

**To fix this, move the file pointer back to a known position at the start of the function, before reading. Each call then reads from that position rather than from the end of the file. With text files the choice of position is limited, and here it is the start of the file, which suits our purposes fine.**
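
**The effect is easy to see in isolation. In the sketch below, a second loop over the same open file finds nothing to read until the pointer is moved back to the start. The scratch file is a hypothetical stand-in for the invoice file.**

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "rewind_demo.txt")
with open(path, "w", encoding="utf-8", newline="") as demo:
    demo.write("2022-0001\tSquirrel Storage\t132.50\n")
    demo.write("2022-0002\tSquirrel Storage\t45.30\n")

with open(path, "r+", encoding="utf-8", newline="") as demo:
    first_pass = [line for line in demo]    # reads both lines
    second_pass = [line for line in demo]   # pointer is at the end: nothing left
    demo.seek(0)                            # rewind to the start of the file
    third_pass = [line for line in demo]    # both lines are available again

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 2 0 2
```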

## Using `seek()` and `tell()` methods

**The `seek()` method moves the file pointer, and the `tell()` method returns the current position of the file pointer, although `tell()` is not used in the version of the function directly below.**

In [15]:
def record_invoice(invoice_file: TextIO, 
                   company: str, 
                   amount: float) -> None:
    """Create a new invoice number, and write it to a file on disk.

    :param invoice_file: An open text file, opened using r+ mode
    :param company: The name of the company being invoiced
    :param amount: The amount on the invoice
    """
    # Store file pointer startpoint
    start_line_pos = 0
    # Move to position relative to the start of file
    invoice_file.seek(start_line_pos, SEEK_SET)
    
    # Read from file pointer position, i.e. start of file
    last_row = ''
    for line in invoice_file:
        print('*', end='')
        last_row = line
    
    # If last row has content
    if last_row:
        invoice_number, c, a = last_row.split('\t')
        new_invoice_number = next_invoice_number(invoice_number)
    else:
        # i.e. if file is empty, start numbering from one
        year = get_year()
        new_invoice_number = f"{year}-{1:04d}"
    
    print(f"{new_invoice_number}\t{company}\t{amount}", file=invoice_file)
    

In [16]:
with open(data_file, 'r+', encoding='utf-8', newline='') as invoices:
    record_invoice(invoices, 'Tongue Twister', 120.00)
    record_invoice(invoices, 'ACME Roadrunner', 45.22)

*******************************

In [17]:
with open(data_file, 'r+', encoding='utf-8', newline='') as invoices:
    record_invoice(invoices, 'Tongue Twister', 120.00)
    record_invoice(invoices, 'ACME Roadrunner', 45.22)
    record_invoice(invoices, 'Simply Red', 0.14)

******************************************************

**Instead of always returning to the start of the file, you can store the position of the last line of data. This means adding a parameter to the function that takes the stored position of the file pointer, and returning the new position, which is recorded with `tell()` *before* the new line is written to the file.**

In [18]:
def record_invoice(invoice_file: TextIO, 
                   company: str, 
                   amount: float, 
                   last_line_pos: int = 0) -> int:
    """Create a new invoice number, and write it to a file on disk.

    :param invoice_file: An open text file, opened using 'r+' mode.
    :param company: The name of the company being invoiced.
    :param amount: The amount on the invoice.
    :param last_line_pos: The position of the start of last line in file.
    Obtained from previous function call, or use default value of zero.
    :return: The position of the last line of data (file pointer) as 
    an integer, to be used by any subsequent function calls in 
    already-opened file.
    """
    # Move to last line position (relative to the start of file)
    invoice_file.seek(last_line_pos, SEEK_SET)
    
    # Loop through lines from last line position
    last_row = ''
    for line in invoice_file:
        print('*', end='')
        last_row = line
    
    # If last row has content
    if last_row:
        invoice_number, c, a = last_row.split('\t')
        new_invoice_number = next_invoice_number(invoice_number)
    else:
        # i.e. if file is empty, start numbering from one
        year = get_year()
        new_invoice_number = f"{year}-{1:04d}"
        
    # Store last line position
    last_line_pos = invoice_file.tell()
    
    # Write new line to file
    print(f"{new_invoice_number}\t{company}\t{amount}", file=invoice_file)
    
    return last_line_pos
    


In [19]:
# Keep each call's return value so the next call can start reading from the last line

with open(data_file, 'r+', encoding='utf-8', newline='') as invoices:
    last_line = record_invoice(invoices, 'Tongue Twister', 120.00)
    last_line = record_invoice(invoices, 'ACME Roadrunner', 45.22, last_line)
    last_line = record_invoice(invoices, 'Simply Red', 0.14, last_line)

**********************