## Write Delimited Strings into Files

Let us understand how to write delimited strings into files. We will start with a list of tuples and see how to convert each tuple into a delimited string before writing it to a file.

Here are the steps involved in writing a list of tuples into a file as delimited strings.
* Convert the list of tuples into list of delimited strings.
* Open the file in write mode using `w` (overwrite) or `a` (append).
* Write the delimited strings into the file, one record per line.
* Validate the data in the file, for example by reading it back and checking the records.
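The steps above can be sketched end to end as follows. This is a minimal illustration rather than the exact code used in the rest of this section: the two sample records, the variable names, and the temporary file location are assumptions made so the snippet can run anywhere.

```python
import os
import tempfile

# Sample records; any list of tuples with mixed types works the same way.
records = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
           (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT')]

# Step 1: convert each tuple into a comma-delimited string.
lines = [','.join(str(item) for item in record) for record in records]

# Step 2: open the file in write mode.
# The temporary directory is a hypothetical stand-in for the real target path.
target_path = os.path.join(tempfile.mkdtemp(), 'part-00000')
target_file = open(target_path, 'w')

# Step 3: add the data, one record per line.
for line in lines:
    target_file.write(f'{line}\n')
target_file.close()

# Step 4: validate by reading the file back and comparing it with the input.
check_file = open(target_path)
lines_read = check_file.read().splitlines()
check_file.close()

print(lines_read == lines)
```

If the records were written correctly, the final comparison prints `True`.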

In [1]:
orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
 (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
 (3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
 (4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
 (5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
 (6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
 (7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
 (8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
 (9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
 (10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]

In [2]:
type(orders)

list

In [3]:
orders[0]

(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED')

In [4]:
type(orders[0])

tuple

In [5]:
order = orders[0]

In [7]:
str.join?

Docstring:
S.join(iterable) -> str

Return a string which is the concatenation of the strings in the
iterable.  The separator between elements is S.
Type:      method_descriptor


In [None]:
'hello'.join

In [8]:
','.join(order) # raises TypeError because the first and third elements are of type int

TypeError: sequence item 0: expected str instance, int found

In [9]:
[str(item) for item in order]

['1', '2013-07-25 00:00:00.0', '11599', 'CLOSED']

In [10]:
# Converting all the items in the tuple to strings using a list comprehension
','.join([str(item) for item in order])

'1,2013-07-25 00:00:00.0,11599,CLOSED'

In [12]:
list(map(lambda item: str(item), order))

['1', '2013-07-25 00:00:00.0', '11599', 'CLOSED']

In [13]:
# Converting all the items in the tuple to strings using the map function
','.join(map(lambda item: str(item), order))

'1,2013-07-25 00:00:00.0,11599,CLOSED'
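As a side note, `str` is itself a callable, so it can be passed to `map` directly without wrapping it in a lambda. The sketch below compares both forms on the same record; the variable names `with_lambda` and `with_str` are only for illustration.

```python
order = (1, '2013-07-25 00:00:00.0', 11599, 'CLOSED')

# Both calls apply str to every element of the tuple before joining.
with_lambda = ','.join(map(lambda item: str(item), order))
with_str = ','.join(map(str, order))

print(with_str)
print(with_lambda == with_str)
```

Both forms produce the same delimited string, so the choice between them is a matter of style.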

In [14]:
orders

[(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
 (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
 (3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
 (4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
 (5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
 (6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
 (7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
 (8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
 (9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
 (10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]

In [15]:
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [16]:
list(orders_csv)

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [17]:
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)
order = list(orders_csv)[0]
order

'1,2013-07-25 00:00:00.0,11599,CLOSED'
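The `map` call is repeated before each use in this section because `map` returns a lazy iterator that can be consumed only once. The following sketch, using two sample records, shows that a second pass over the same map object yields nothing; `first_pass` and `second_pass` are illustrative names.

```python
orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
          (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT')]

orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

first_pass = list(orders_csv)   # consumes the iterator
second_pass = list(orders_csv)  # the iterator is already exhausted

print(len(first_pass))
print(len(second_pass))
```

If the delimited strings are needed more than once, one option is to store them in a list up front instead of recreating the map object.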

* Write the CSV strings to the file one at a time, adding a newline character after each record.

In [18]:
!rm -rf data/retail_db/orders

In [19]:
!mkdir -p data/retail_db/orders

In [20]:
orders_file = open('data/retail_db/orders/part-00000', 'w')

In [21]:
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [22]:
for order in orders_csv:
    orders_file.write(f'{order}\n')

In [23]:
orders_file.close()
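An alternative to calling `close` explicitly is to open the file in a `with` block, which closes the file when the block exits, even if a write raises an exception. The sketch below assumes two sample records and writes to a temporary directory rather than to `data/retail_db/orders`, so that it does not touch the real data.

```python
import os
import tempfile

orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
          (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT')]
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

# Hypothetical location used only for this illustration.
file_path = os.path.join(tempfile.mkdtemp(), 'part-00000')

# The file is closed automatically when the with block ends.
with open(file_path, 'w') as orders_file:
    for order in orders_csv:
        orders_file.write(f'{order}\n')

print(orders_file.closed)
```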

* Write all the records as one big string. Because we open the file using `w`, the file is truncated, which means its existing contents are replaced by the string we write.

In [25]:
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [26]:
orders_string = '\n'.join(orders_csv)

In [27]:
orders_string

'1,2013-07-25 00:00:00.0,11599,CLOSED\n2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT\n3,2013-07-25 00:00:00.0,12111,COMPLETE\n4,2013-07-25 00:00:00.0,8827,CLOSED\n5,2013-07-25 00:00:00.0,11318,COMPLETE\n6,2013-07-25 00:00:00.0,7130,COMPLETE\n7,2013-07-25 00:00:00.0,4530,COMPLETE\n8,2013-07-25 00:00:00.0,2911,PROCESSING\n9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT\n10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT'

In [28]:
orders_file = open('data/retail_db/orders/part-00000', 'w')

In [29]:
orders_file.write(orders_string)

401

In [30]:
orders_file.close()
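To make the difference between `w` and `a` concrete, here is a small sketch under assumed sample data. It also shows that `write` returns the number of characters written, which is where a value such as `401` comes from, and that a string built with `'\n'.join` has no trailing newline, so a newline has to be added before appending more records. The batch contents and the temporary path are hypothetical.

```python
import os
import tempfile

file_path = os.path.join(tempfile.mkdtemp(), 'part-00000')

first_batch = '\n'.join(['1,2013-07-25 00:00:00.0,11599,CLOSED',
                         '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT'])
second_batch = '3,2013-07-25 00:00:00.0,12111,COMPLETE'

# Mode 'w' truncates the file, so only the last write survives.
with open(file_path, 'w') as orders_file:
    chars_written = orders_file.write(first_batch)
with open(file_path, 'w') as orders_file:
    orders_file.write(second_batch)
with open(file_path) as orders_file:
    after_overwrite = orders_file.read().splitlines()

# Mode 'a' keeps the existing contents and adds to the end of the file.
with open(file_path, 'w') as orders_file:
    orders_file.write(first_batch)
with open(file_path, 'a') as orders_file:
    # first_batch has no trailing newline, so start the appended data on a new line.
    orders_file.write('\n' + second_batch)
with open(file_path) as orders_file:
    after_append = orders_file.read().splitlines()

print(chars_written == len(first_batch))
print(len(after_overwrite))
print(len(after_append))
```

With this sample data, one record remains after the overwrite and three records are present after the append.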