## Interacting with the OS and filesystem

The `os` module in Python provides many functions for interacting with the OS and the filesystem. Let's import it and try out some examples.

In [1]:
import os

In [2]:
os.getcwd()

'C:\\Users\\shalini'

In [3]:
os.listdir()

['.bash_history',
 '.conda',
 '.condarc',
 '.continuum',
 '.gitconfig',
 '.idlerc',
 '.ipynb_checkpoints',
 '.ipython',
 '.jupyter',
 '.lesshst',
 '.matplotlib',
 '.ssh',
 '3D Objects',
 'anaconda3',
 'AppData',
 'Application Data',
 'Contacts',
 'Cookies',
 'Desktop',
 'Documents',
 'Downloads',
 'Favorites',
 'IntelGraphicsProfiles',
 'Links',
 'Local Settings',
 'movies2.csv',
 'movies3.csv',
 'Music',
 'My Documents',
 'NetHood',
 'NTUSER.DAT',
 'ntuser.dat.LOG1',
 'ntuser.dat.LOG2',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TM.blf',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TMContainer00000000000000000001.regtrans-ms',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TMContainer00000000000000000002.regtrans-ms',
 'ntuser.ini',
 'numpy',
 'OneDrive',
 'Pictures',
 'PrintHood',
 'PycharmProjects',
 'Recent',
 'Saved Games',
 'Searches',
 'SendTo',
 'Sources',
 'Start Menu',
 'Templates',
 'test',
 'Untitled.ipynb',
 'Untitled1.ipynb',
 'Untitled10.ipynb',
 'Untit

In [4]:
os.listdir('.') # relative path

['.bash_history',
 '.conda',
 '.condarc',
 '.continuum',
 '.gitconfig',
 '.idlerc',
 '.ipynb_checkpoints',
 '.ipython',
 '.jupyter',
 '.lesshst',
 '.matplotlib',
 '.ssh',
 '3D Objects',
 'anaconda3',
 'AppData',
 'Application Data',
 'Contacts',
 'Cookies',
 'Desktop',
 'Documents',
 'Downloads',
 'Favorites',
 'IntelGraphicsProfiles',
 'Links',
 'Local Settings',
 'movies2.csv',
 'movies3.csv',
 'Music',
 'My Documents',
 'NetHood',
 'NTUSER.DAT',
 'ntuser.dat.LOG1',
 'ntuser.dat.LOG2',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TM.blf',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TMContainer00000000000000000001.regtrans-ms',
 'NTUSER.DAT{8dfbc6f5-d4a3-11ea-a8e5-fc4596a70159}.TMContainer00000000000000000002.regtrans-ms',
 'ntuser.ini',
 'numpy',
 'OneDrive',
 'Pictures',
 'PrintHood',
 'PycharmProjects',
 'Recent',
 'Saved Games',
 'Searches',
 'SendTo',
 'Sources',
 'Start Menu',
 'Templates',
 'test',
 'Untitled.ipynb',
 'Untitled1.ipynb',
 'Untitled10.ipynb',
 'Untit

In [5]:
os.listdir('/Users') # absolute path

['All Users',
 'Default',
 'Default User',
 'defaultuser0',
 'desktop.ini',
 'Public',
 'shalini']
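
To see how relative and absolute paths behave without depending on the folders of any particular machine, here is a minimal sketch that works inside a temporary directory. The names `reports` and `notes.txt` are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory (the entry names below are hypothetical).
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, 'reports'))
    with open(os.path.join(base, 'notes.txt'), 'w') as f:
        f.write('hello')

    # listdir returns bare names; join them with the parent to inspect them
    entries = sorted(os.listdir(base))
    kinds = {name: os.path.isdir(os.path.join(base, name)) for name in entries}

print(entries)  # ['notes.txt', 'reports']
print(kinds)    # {'notes.txt': False, 'reports': True}

# '.' is a relative path; abspath resolves it against the current directory
print(os.path.isabs('.'), os.path.isabs(os.path.abspath('.')))  # False True
```

Using `os.path.join` instead of hard-coding `/` or `\\` keeps the same code working on Windows, macOS and Linux.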

You can create a new directory using `os.makedirs`. Let's create a new directory called `numpy`, where we'll later download some files.

In [6]:
os.makedirs('./numpy', exist_ok=True)

In [7]:
'numpy' in os.listdir('.')

True

In [8]:
help(os.makedirs)

Help on function makedirs in module os:

makedirs(name, mode=511, exist_ok=False)
    makedirs(name [, mode=0o777][, exist_ok=False])
    
    Super-mkdir; create a leaf directory and all intermediate ones.  Works like
    mkdir, except that any intermediate path segment (not just the rightmost)
    will be created if it does not exist. If the target directory already
    exists, raise an OSError if exist_ok is False. Otherwise no exception is
    raised.  This is recursive.
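
The effect of `exist_ok` described in the help text can be checked with a small sketch. It uses a temporary directory and a hypothetical nested path `a/b/c`, so nothing in your home folder is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, 'a', 'b', 'c')  # hypothetical nested path

    os.makedirs(nested)                 # creates a, a/b and a/b/c in one call
    created = os.path.isdir(nested)

    os.makedirs(nested, exist_ok=True)  # directory already exists: no error

    try:
        os.makedirs(nested)             # exist_ok defaults to False
        raised = False
    except FileExistsError:             # a subclass of OSError
        raised = True

print(created, raised)  # True True
```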



In [9]:
os.listdir('./numpy')

['loans1',
 'loans1.csv',
 'loans1.txt',
 'loans2.txt',
 'loans3.txt',
 'loansnew1.txt',
 'loansnew2.txt',
 'loansnew3.txt',
 'movie.csv']

In [10]:
# download file into numpy
url1 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans1.txt'
url2 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans2.txt'
url3 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans3.txt'

In [11]:
from urllib.request import urlretrieve

In [12]:
urlretrieve(url1, './numpy/loans1.txt')
urlretrieve(url2, './numpy/loans2.txt')
urlretrieve(url3, './numpy/loans3.txt')

('./numpy/loans3.txt', <http.client.HTTPMessage at 0x253491a5880>)

In [13]:
os.listdir('./numpy')

['loans1',
 'loans1.csv',
 'loans1.txt',
 'loans2.txt',
 'loans3.txt',
 'loansnew1.txt',
 'loansnew2.txt',
 'loansnew3.txt',
 'movie.csv']

In [14]:
file1=open('./numpy/loans1.txt' , mode='r')

The `open` function also accepts a `mode` argument that specifies how we can interact with the file. The following options are supported:

```
    ========= ===============================================================
    Character Meaning
    --------- ---------------------------------------------------------------
    'r'       open for reading (default)
    'w'       open for writing, truncating the file first
    'x'       create a new file and open it for writing
    'a'       open for writing, appending to the end of the file if it exists
    'b'       binary mode
    't'       text mode (default)
    '+'       open a disk file for updating (reading and writing)
    'U'       universal newline mode (deprecated)
    ========= ===============================================================
```
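
As a rough illustration of the most common modes, the sketch below writes, appends to, and overwrites a scratch file. The file name `demo.txt` is an assumption made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, 'demo.txt')  # hypothetical scratch file

    with open(path, 'w') as f:   # 'w' creates the file (or truncates it)
        f.write('first\n')
    with open(path, 'a') as f:   # 'a' keeps existing content and appends
        f.write('second\n')
    with open(path, 'r') as f:
        after_append = f.read()

    with open(path, 'w') as f:   # opening with 'w' again discards old content
        f.write('replaced\n')
    with open(path) as f:        # mode defaults to 'r' (text mode)
        after_truncate = f.read()

    try:
        open(path, 'x')          # 'x' refuses to overwrite an existing file
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_truncate))  # 'replaced\n'
print(x_failed)              # True
```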

To view the contents of the file, we can use the `read` method of the file object.

In [15]:
file1_content=file1.read()

In [16]:
file1_content

'amount,duration,rate,down_payment\n100000,36,0.08,20000\n200000,12,0.1,\n628400,120,0.12,100000\n4637400,240,0.06,\n42900,90,0.07,8900\n916000,16,0.13,\n45230,48,0.08,4300\n991360,99,0.08,\n423000,27,0.09,47200'

In [17]:
print(file1_content)

amount,duration,rate,down_payment
100000,36,0.08,20000
200000,12,0.1,
628400,120,0.12,100000
4637400,240,0.06,
42900,90,0.07,8900
916000,16,0.13,
45230,48,0.08,4300
991360,99,0.08,
423000,27,0.09,47200


In [18]:
# Close the file once we are done; otherwise it stays open and keeps holding system resources
file1.close()


In [19]:
file1.read()

ValueError: I/O operation on closed file.


## Closing files automatically using `with`

To close a file automatically after you've processed it, you can open it using the `with` statement.

In [None]:
with open('./numpy/loans2.txt', mode='r') as file2:
    file2_content = file2.read()
    print(file2_content)

In [None]:
file2.read()

ValueError: I/O operation on closed file.
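
The closing behaviour of `with` can also be checked directly through the `closed` attribute of the file object. This is a small sketch on a scratch file; the name `demo.txt` is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, 'demo.txt')  # hypothetical scratch file
    with open(path, 'w') as f:
        f.write('amount,duration\n')

    with open(path) as f:
        content = f.read()
        open_inside = not f.closed   # still open inside the block

    closed_after = f.closed          # closed as soon as the block ends

    try:
        f.read()                     # reading a closed file fails
        error = None
    except ValueError as e:
        error = str(e)

print(open_inside, closed_after)  # True True
print(error)                      # I/O operation on closed file.
```

The file is closed even if an exception is raised inside the `with` block, which is the main reason to prefer it over calling `close` by hand.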

## Reading a file line by line


File objects provide a `readlines` method to read a file line-by-line. 

In [20]:
with open('./numpy/loans3.txt', mode='r') as file3:
    file3_content = file3.readlines()
    print(file3_content)

['amount,duration,rate,down_payment\n', '45230,48,0.07,4300\n', '883000,16,0.14,\n', '100000,12,0.1,\n', '728400,120,0.12,100000\n', '3637400,240,0.06,\n', '82900,90,0.07,8900\n', '316000,16,0.13,\n', '15230,48,0.08,4300\n', '991360,99,0.08,\n', '323000,27,0.09,4720010000,36,0.08,20000\n', '528400,120,0.11,100000\n', '8633400,240,0.06,\n', '12900,90,0.08,8900']
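
`readlines` loads every line into a list at once. As an alternative for large files, a file object can also be iterated over directly, one line at a time. The sketch below assumes a small scratch file named `demo.csv`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, 'demo.csv')  # hypothetical scratch file
    with open(path, 'w') as f:
        f.write('amount,duration\n100000,36\n200000,12')

    lines = []
    with open(path) as f:
        for line in f:                       # reads one line per iteration
            lines.append(line.rstrip('\n'))  # drop the trailing newline

print(lines)  # ['amount,duration', '100000,36', '200000,12']
```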


## Processing data from files

Before performing any operations on the data stored in a file, we need to convert the file's contents from one large string into Python data types. For the file `loans1.txt` containing information about loans in a CSV format, we can do the following:

* Read the file line by line
* Parse the first line to get a list of the column names or headers
* Split each remaining line and convert each value into a float
* Create a dictionary for each loan using the headers as keys
* Create a list of dictionaries to keep track of all the loans

Since we will perform the same operations for multiple files, it would be useful to define a function `read_csv`. We'll also define some helper functions to build up the functionality step by step. 

Let's start by defining a function `parse_header` that takes a line as input and returns a list of column headers.

In [21]:
def parse_header(header_line):
    return header_line.strip().split(',')

The `strip` method removes any extra spaces and the newline character `\n`. The `split` method breaks a string into a list using the given separator (`,` in this case).
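
Here is a quick sketch of the two methods on sample strings. Note that a line ending in a comma produces an empty string as its last field.

```python
line = '  amount,duration,rate,down_payment\n'

stripped = line.strip()       # removes leading/trailing whitespace and '\n'
fields = stripped.split(',')  # breaks the string at every comma

print(repr(stripped))  # 'amount,duration,rate,down_payment'
print(fields)          # ['amount', 'duration', 'rate', 'down_payment']

# A trailing comma means the last field is the empty string ''
print('883000,16,0.14,\n'.strip().split(','))  # ['883000', '16', '0.14', '']
```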

In [22]:
file3_content[0]

'amount,duration,rate,down_payment\n'

In [23]:
header=parse_header(file3_content[0])

In [24]:
header


['amount', 'duration', 'rate', 'down_payment']

Next, let's define a function `pharse_line` that takes a line containing some data and returns a list of floating-point numbers.

In [25]:
def pharse_line(data_line) :
    value= []
    for items in data_line.strip().split(',') :
        value.append(float(items))
        
    return value
    

In [26]:
print (pharse_line(file3_content[1]))

[45230.0, 48.0, 0.07, 4300.0]


In [27]:
file3_content[2]

'883000,16,0.14,\n'

In [28]:
print (pharse_line(file3_content[2]))

ValueError: could not convert string to float: ''



An empty string cannot be converted to a float, so the function needs to handle values that fail to convert.

In [229]:
def pharse_line(data_line) :
    value= []
    for items in data_line.strip().split(',') :
        if items== ' ' :
             value.append(0.0)
        else :
            try :
                value.append(float(items))
            except ValueError :
                value.append(items)   
    return value

In [230]:
print (pharse_line(file3_content[2]))

[883000.0, 16.0, 0.14, '']

Note that the empty field comes back as `''` rather than `0.0`: the check above compares against a single space `' '`, so an empty string falls through to the `except` branch and is appended unchanged. The full code further below compares against `''` instead, which is why its output shows `0.0`.


Next, let's define a function `create_item_dict` that takes a list of values and a list of headers as inputs and returns a dictionary with the values associated with their respective headers as keys.

In [231]:
def create_item_dict(values, headers):
    result = {}
    for value, header in zip(values, headers):
        result[header] = value
    return result

Can you figure out what the Python built-in function `zip` does? Try out an example, or [read the documentation](https://docs.python.org/3.3/library/functions.html#zip).

In [232]:
for item in zip([1,2,3], ['a', 'b', 'c'], ['e','f','g']):
    print(item)

(1, 'a', 'e')
(2, 'b', 'f')
(3, 'c', 'g')
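
Two more properties of `zip` are worth trying out: pairs of `(key, value)` can be passed straight to `dict`, and `zip` stops at the shortest input. The sample values below are taken from the loans data.

```python
headers = ['amount', 'duration', 'rate', 'down_payment']
values = [45230.0, 48.0, 0.07, 4300.0]

pairs = list(zip(headers, values))    # list of (header, value) tuples
as_dict = dict(zip(headers, values))  # dict() accepts such pairs directly

print(pairs[0])  # ('amount', 45230.0)
print(as_dict)   # {'amount': 45230.0, 'duration': 48.0, 'rate': 0.07, 'down_payment': 4300.0}

# zip stops as soon as the shortest input runs out
short = list(zip([1, 2, 3], ['a', 'b']))
print(short)     # [(1, 'a'), (2, 'b')]
```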


In [233]:
file3_content[1]

'45230,48,0.07,4300\n'

In [234]:
values1=pharse_line(file3_content[1])
header=parse_header(file3_content[0])
create_item_dict(values1, header)

{'amount': 45230.0, 'duration': 48.0, 'rate': 0.07, 'down_payment': 4300.0}

As expected, the values and headers are combined to create a dictionary with the appropriate key-value pairs.

We are now ready to put it all together and define the `read_csv` function.

In [235]:
def read_csv(path) :
    result= []
    with open(path, 'r') as fileread :
        fileread_content=fileread.readlines()
        header = parse_header(fileread_content[0])  # use the header of the file being read
        for data_line in fileread_content[1:] :
            values1=pharse_line(data_line)
            item_dict = create_item_dict(values1, header)
            result.append(item_dict)
            
    return result
        

In [236]:
read_csv('./numpy/loans1.csv')

[{'amount': 100000.0, 'duration': 36.0, 'rate': 0.08, 'down_payment': 20000.0},
 {'amount': 200000.0, 'duration': 12.0, 'rate': 0.1, 'down_payment': ''},
 {'amount': 628400.0,
  'duration': 120.0,
  'rate': 0.12,
  'down_payment': 100000.0},
 {'amount': 4637400.0, 'duration': 240.0, 'rate': 0.06, 'down_payment': ''},
 {'amount': 42900.0, 'duration': 90.0, 'rate': 0.07, 'down_payment': 8900.0},
 {'amount': 916000.0, 'duration': 16.0, 'rate': 0.13, 'down_payment': ''},
 {'amount': 45230.0, 'duration': 48.0, 'rate': 0.08, 'down_payment': 4300.0},
 {'amount': 991360.0, 'duration': 99.0, 'rate': 0.08, 'down_payment': ''},
 {'amount': 423000.0, 'duration': 27.0, 'rate': 0.09, 'down_payment': 47200.0}]

### Let's see the full code

In [237]:
def parse_header(headers) :
    return headers.strip().split(',')

def pharse_line(content) :
    values= []
    for value in content.strip().split(',') :
        if value == '' :
            values.append(0.0)
        else :
            try :
                values.append(float(value))
            except  ValueError :
                values.append(value)
    return values

def create_item_dict(values, headers):
    result = {}
    for value, header in zip(values, headers):
        result[header] = value
    return result

def read_csv(path) :
    result = []
    with open(path, 'r') as fileread :
        fileread_content=fileread.readlines()
        header = parse_header(fileread_content[0])  # use the header of the file being read
        for data_line in fileread_content[1:] :
            values1=pharse_line(data_line)
            item_dict = create_item_dict(values1, header)
            result.append(item_dict)
            
    return result

read_csv('./numpy/loansnew1.txt')

[{'amount': 100000.0, 'duration': 36.0, 'rate': 0.08, 'down_payment': 20000.0},
 {'amount': 200000.0, 'duration': 12.0, 'rate': 0.1, 'down_payment': 0.0},
 {'amount': 628400.0,
  'duration': 120.0,
  'rate': 0.12,
  'down_payment': 100000.0},
 {'amount': 4637400.0, 'duration': 240.0, 'rate': 0.06, 'down_payment': 0.0},
 {'amount': 42900.0, 'duration': 90.0, 'rate': 0.07, 'down_payment': 8900.0},
 {'amount': 916000.0, 'duration': 16.0, 'rate': 0.13, 'down_payment': 0.0},
 {'amount': 45230.0, 'duration': 48.0, 'rate': 0.08, 'down_payment': 4300.0},
 {'amount': 991360.0, 'duration': 99.0, 'rate': 0.08, 'down_payment': 0.0},
 {'amount': 423000.0, 'duration': 27.0, 'rate': 0.09, 'down_payment': 47200.0}]

In [238]:
import math

def loan_emi(amount, duration, rate, down_payment=0):
    """Calculates the equal monthly installment (EMI) for a loan.
    
    Arguments:
        amount - Total amount to be spent (loan + down payment)
        duration - Duration of the loan (in months)
        rate - Rate of interest (monthly)
        down_payment (optional) - Optional initial payment (deducted from amount)
    """
    loan_amount = amount - down_payment
    try:
        emi = loan_amount * rate * ((1+rate)**duration) / (((1+rate)**duration)-1)
    except ZeroDivisionError :
        if duration == 0.0 :
            emi =0.0
        else :
            emi = loan_amount / duration
  
    emi = math.ceil(emi)
    return emi
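
Written out as a formula, the computation in `loan_emi` looks like this, where $P$ is `amount - down_payment`, $r$ is the monthly rate and $n$ is the duration in months (the symbols are introduced here only for illustration):

$$
\text{EMI} = \left\lceil \frac{P \, r \, (1+r)^{n}}{(1+r)^{n} - 1} \right\rceil
$$

When $r = 0$ the denominator is zero, which is why the function falls back to $P / n$ in the `except` branch. As a rough check with the first loan in `loans1.txt`, $P = 100000 - 20000 = 80000$, $r = 0.08 / 12 \approx 0.006667$ and $n = 36$ give $(1+r)^{36} \approx 1.2702$, so

$$
\text{EMI} \approx \left\lceil \frac{80000 \times 0.006667 \times 1.2702}{0.2702} \right\rceil = \lceil 2506.9 \rceil = 2507
$$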


In [239]:
loans = read_csv('./numpy/loans1.csv')
print(loans)

    

[{'amount': 100000.0, 'duration': 36.0, 'rate': 0.08, 'down_payment': 20000.0}, {'amount': 200000.0, 'duration': 12.0, 'rate': 0.1, 'down_payment': 0.0}, {'amount': 628400.0, 'duration': 120.0, 'rate': 0.12, 'down_payment': 100000.0}, {'amount': 4637400.0, 'duration': 240.0, 'rate': 0.06, 'down_payment': 0.0}, {'amount': 42900.0, 'duration': 90.0, 'rate': 0.07, 'down_payment': 8900.0}, {'amount': 916000.0, 'duration': 16.0, 'rate': 0.13, 'down_payment': 0.0}, {'amount': 45230.0, 'duration': 48.0, 'rate': 0.08, 'down_payment': 4300.0}, {'amount': 991360.0, 'duration': 99.0, 'rate': 0.08, 'down_payment': 0.0}, {'amount': 423000.0, 'duration': 27.0, 'rate': 0.09, 'down_payment': 47200.0}]


In [240]:
for loan in loans :
    loan['emi'] = loan_emi(loan['amount'], loan['duration'], loan['rate']/12, loan['down_payment'])
print(loans)

[{'amount': 100000.0, 'duration': 36.0, 'rate': 0.08, 'down_payment': 20000.0, 'emi': 2507}, {'amount': 200000.0, 'duration': 12.0, 'rate': 0.1, 'down_payment': 0.0, 'emi': 17584}, {'amount': 628400.0, 'duration': 120.0, 'rate': 0.12, 'down_payment': 100000.0, 'emi': 7582}, {'amount': 4637400.0, 'duration': 240.0, 'rate': 0.06, 'down_payment': 0.0, 'emi': 33224}, {'amount': 42900.0, 'duration': 90.0, 'rate': 0.07, 'down_payment': 8900.0, 'emi': 487}, {'amount': 916000.0, 'duration': 16.0, 'rate': 0.13, 'down_payment': 0.0, 'emi': 62664}, {'amount': 45230.0, 'duration': 48.0, 'rate': 0.08, 'down_payment': 4300.0, 'emi': 1000}, {'amount': 991360.0, 'duration': 99.0, 'rate': 0.08, 'down_payment': 0.0, 'emi': 13712}, {'amount': 423000.0, 'duration': 27.0, 'rate': 0.09, 'down_payment': 47200.0, 'emi': 15428}]


In [241]:
def compute_emis(loans):
    for loan in loans:
        loan['emi'] = loan_emi(
            loan['amount'], 
            loan['duration'], 
            loan['rate']/12, # the CSV contains yearly rates
            loan['down_payment'])
    
        

In [242]:
with open('./numpy/loans1.txt' , 'w') as loan1 :
   
    for loan in loans :
        loan1.write('{},{},{},{},{}\n' .format(loan['amount'], 
            loan['duration'], 
            loan['rate']/12, # the CSV contains yearly rates
            loan['down_payment'], loan['emi']))

In [243]:
import os

os.listdir('numpy')

['loans1',
 'loans1.csv',
 'loans1.txt',
 'loans2.txt',
 'loans3.txt',
 'loansnew1.txt',
 'loansnew2.txt',
 'loansnew3.txt',
 'movie.csv']

In [244]:
with open('./numpy/loans1' , 'r') as  file :
    file_con=file.read()
    print(file_con)

1.0,0.0,0.0,0.0,0
2.0,0.0,0.0,0.0,0
6.0,2.0,0.6666666666666666,4.0,3
4.0,6.0,0.25,7.0,-1
4.0,2.0,0.75,0.0,5
9.0,1.0,0.5,0.0,14
4.0,5.0,0.16666666666666666,3.0,1
9.0,9.0,0.08333333333333333,3.0,1
4.0,2.0,0.25,0.0,3



In [245]:
def write_csv(items, path) :
    with open(path , 'w') as  file :
        if len(items) == 0:
            return
        headers=list(items[0].keys())
        file.write(','.join(headers) + '\n')
        for item in items :
            value = []
            for header in headers :
                value.append(str(item.get(header,"")))
            file.write(','.join(value) + '\n')  # same separator as the header row

In [246]:
loans=read_csv('./numpy/loans2.txt')
compute_emis(loans)
write_csv(loans , './numpy/loans3.txt')

## Using Pandas to Read and Write CSVs

There are some limitations to the `read_csv` and `write_csv` functions we've defined above:

* The `read_csv` function fails to create a proper dictionary if any of the values in the CSV files contains commas
* The `write_csv` function fails to create a proper CSV if any of the values to be written contains commas

When a value in a CSV file contains a comma (`,`), the value is generally placed within double quotes. Double quotes (`"`) in values are converted into two double quotes (`""`). Here's an example:

```
title,description
Fast & Furious,"A movie, a race, a franchise"
The Dark Knight,"Gotham, the ""Batman"", and the Joker"
Memento,A guy forgets everything every 15 minutes

```
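
To see the difference quoting makes, here is a small sketch that parses the sample above twice: once by splitting on commas, and once with the `csv` module from Python's standard library, which understands quoted values.

```python
import csv
import io

text = '''title,description
Fast & Furious,"A movie, a race, a franchise"
The Dark Knight,"Gotham, the ""Batman"", and the Joker"
Memento,A guy forgets everything every 15 minutes
'''

# Naive approach: splitting on ',' also cuts through the quoted description
naive = text.splitlines()[1].split(',')
print(len(naive))  # 4

# csv.reader keeps quoted values together and un-doubles the "" pairs
rows = list(csv.reader(io.StringIO(text)))
print(rows[1])     # ['Fast & Furious', 'A movie, a race, a franchise']
print(rows[2][1])  # Gotham, the "Batman", and the Joker
```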

Let's try it out.

In [247]:
movies_url = "https://gist.githubusercontent.com/aakashns/afee0a407d44bbc02321993548021af9/raw/6d7473f0ac4c54aca65fc4b06ed831b8a4840190/movies.csv"

In [248]:
urlretrieve(movies_url , './numpy/movie.csv')

('./numpy/movie.csv', <http.client.HTTPMessage at 0x253481ca460>)

In [249]:
!pip install pandas --upgrade --quiet

In [250]:
import pandas as pd

In [251]:
movies_data_frames=pd.read_csv('./numpy/movie.csv')

In [252]:
movies_data_frames

Unnamed: 0,title,description
0,Fast & Furious,"A movie, a race, a franchise"
1,The Dark Knight,"Gotham, the ""Batman"", and the Joker"
2,Memento,A guy forgets everything every 15 minutes


In [253]:
movies=movies_data_frames.to_dict('records')

In [254]:
movies

[{'title': 'Fast & Furious', 'description': 'A movie, a race, a franchise'},
 {'title': 'The Dark Knight',
  'description': 'Gotham, the "Batman", and the Joker'},
 {'title': 'Memento',
  'description': 'A guy forgets everything every 15 minutes'}]

In [255]:
movies_dict = movies_data_frames.to_dict()

In [256]:
movies_dict

{'title': {0: 'Fast & Furious', 1: 'The Dark Knight', 2: 'Memento'},
 'description': {0: 'A movie, a race, a franchise',
  1: 'Gotham, the "Batman", and the Joker',
  2: 'A guy forgets everything every 15 minutes'}}

In [257]:
write_csv(movies, 'movies2.csv')

In [258]:
os.listdir('./numpy')

['loans1',
 'loans1.csv',
 'loans1.txt',
 'loans2.txt',
 'loans3.txt',
 'loansnew1.txt',
 'loansnew2.txt',
 'loansnew3.txt',
 'movie.csv']

In [259]:
!head movies2.csv

'head' is not recognized as an internal or external command,
operable program or batch file.
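
The `head` command is a Unix utility and isn't available in the Windows command prompt, which is why the cell above fails. A portable alternative is to read the first few lines in Python. The helper `head` below is a hypothetical name, and the scratch file is created only for the demonstration.

```python
import os
import tempfile
from itertools import islice

def head(path, n=10):
    """Return the first n lines of a text file (hypothetical helper)."""
    with open(path) as f:
        # islice stops after n lines without reading the rest of the file
        return [line.rstrip('\n') for line in islice(f, n)]

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, 'demo.csv')  # hypothetical scratch file
    with open(path, 'w') as f:
        f.write('title,description\nMemento,A guy forgets\nThird,line\n')

    first_two = head(path, 2)

print(first_two)  # ['title,description', 'Memento,A guy forgets']
```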


In [260]:
pd.read_csv('movies2.csv')

Unnamed: 0,Unnamed: 1,title,description
Fast & Furious,A movie,a race,a franchise
The Dark Knight,Gotham,"the ""Batman""",and the Joker
Memento,A guy forgets everything every 15 minutes,,


In [261]:
df2= pd.DataFrame(movies)

In [262]:
!head movies2.csv

'head' is not recognized as an internal or external command,
operable program or batch file.


In [263]:
df2.to_csv('movies3.csv', index=False)

In [264]:
pd.read_csv('movies3.csv')

Unnamed: 0,title,description
0,Fast & Furious,"A movie, a race, a franchise"
1,The Dark Knight,"Gotham, the ""Batman"", and the Joker"
2,Memento,A guy forgets everything every 15 minutes


## Exercise - Processing CSV files using a dictionary of lists

We defined the functions `read_csv` and `write_csv` above to convert a CSV file into a list of dictionaries and vice versa. In this exercise, you'll transform the CSV data into a dictionary of lists instead, with one list for each column in the file.

For example, consider the following CSV file:

```
amount,duration,rate,down_payment
828400,120,0.11,100000
4633400,240,0.06,
42900,90,0.08,8900
983000,16,0.14,
15230,48,0.07,4300
```

We'll convert it into the following dictionary of lists:

```
{
  amount: [828400, 4633400, 42900, 983000, 15230],
  duration: [120, 240, 90, 16, 48],
  rate: [0.11, 0.06, 0.08, 0.14, 0.07],
  down_payment: [100000, 0, 8900, 0, 4300]
}
```
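
As a hint for the conversion itself, here is one possible sketch of moving between the two layouts in plain Python. The helper names `rows_to_columns` and `columns_to_rows` are made up for this example and are not part of the exercise.

```python
rows = [
    {'amount': 828400.0, 'duration': 120.0},
    {'amount': 4633400.0, 'duration': 240.0},
]

def rows_to_columns(rows):
    """List of dictionaries -> dictionary of lists (hypothetical helper)."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

def columns_to_rows(columns):
    """Dictionary of lists -> list of dictionaries (hypothetical helper)."""
    headers = list(columns.keys())
    # zip(*lists) transposes the columns back into one tuple per row
    return [dict(zip(headers, values)) for values in zip(*columns.values())]

columns = rows_to_columns(rows)
print(columns)  # {'amount': [828400.0, 4633400.0], 'duration': [120.0, 240.0]}
print(columns_to_rows(columns) == rows)  # True
```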

Complete the following tasks using the empty cells below:

1. Download three CSV files to the folder `data2` using the URLs listed in the code cell below, and verify the downloaded files.
2. Define a function `read_csv_columnar` that reads a CSV file and returns a dictionary of lists in the format shown above. 
3. Define a function `compute_emis` that adds another key `emi` into the dictionary with a list of EMIs computed for each row of data.
4. Define a function `write_csv_columnar` that writes the data from the dictionary of lists into a correctly formatted CSV file.
5. Process all three downloaded files and write the results by creating new files in the directory `data2`.

Define helper functions wherever required.

In [477]:
url1 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans1.txt'
url2 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans2.txt'
url3 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans3.txt'

In [478]:
from urllib.request import urlretrieve

In [479]:
import os
os.makedirs('./data2' , exist_ok=True )

In [480]:
urlretrieve(url1, './data2/loansnew1.txt')
urlretrieve(url2, './data2/loansnew2.txt')
urlretrieve(url3, './data2/loansnew3.txt')

('./data2/loansnew3.txt', <http.client.HTTPMessage at 0x2534d369610>)

In [481]:
with open('./data2/loansnew1.txt', 'r') as fileread:
    fileread_content = fileread.readlines()
    print(fileread_content)

['amount,duration,rate,down_payment\n', '100000,36,0.08,20000\n', '200000,12,0.1,\n', '628400,120,0.12,100000\n', '4637400,240,0.06,\n', '42900,90,0.07,8900\n', '916000,16,0.13,\n', '45230,48,0.08,4300\n', '991360,99,0.08,\n', '423000,27,0.09,47200']


In [498]:
def parse_header(items):
    return items.strip().split(',')

def parse_values(data_line):
    values = []
    for item in data_line.strip().split(','):
        if item == '':
            values.append(0.0)
        else:
            try:
                values.append(float(item))
            except ValueError:
                values.append(item)
    return values

def create_item_dict(values, headers):
    result = {}
    for header, value in zip(headers, values):
        result[header] = value
    return result

import math

def loan_emi(amount, duration, rate, down_payment=0):
    """Calculates the equal monthly installment (EMI) for a loan.
    
    Arguments:
        amount - Total amount to be spent (loan + down payment)
        duration - Duration of the loan (in months)
        rate - Rate of interest (monthly)
        down_payment (optional) - Optional initial payment (deducted from amount)
    """
    loan_amount = amount - down_payment
    try:
        emi = loan_amount * rate * ((1+rate)**duration) / (((1+rate)**duration)-1)
    except ZeroDivisionError :
        if duration == 0.0 :
            emi =0.0
        else :
            emi = loan_amount / duration
  
    emi = math.ceil(emi)
    return emi

def compute_emis(loans):
    # Here `loans` is one parsed row: [amount, duration, rate, down_payment]
    return loan_emi(
        loans[0],
        loans[1],
        loans[2]/12, # the CSV contains yearly rates
        loans[3])

def read_csv_columnar(path):
    rows = []
    with open(path, 'r') as f:
        lines = f.readlines()
        headers = parse_header(lines[0])
        headers.append('emi')
        for data_line in lines[1:]:
            values = parse_values(data_line)
            values.append(compute_emis(values))
            rows.append(values)
    # Transpose the list of rows into one list per column
    columns = list(map(list, zip(*rows)))
    return create_item_dict(columns, headers)

def write_csv_columnar(items, path):
    with open(path, 'w') as file:
        if len(items) == 0:
            return
        headers = list(items.keys())
        file.write(','.join(headers) + '\n')
        # Transpose the columns back into rows before writing
        rows = zip(*items.values())
        for row in rows:
            file.write(','.join(str(value) for value in row) + '\n')

In [499]:
loans=read_csv_columnar('./data2/loansnew1.txt')


In [500]:
write_csv_columnar(loans, './data2/emis1.csv')

In [501]:
loans=read_csv_columnar('./data2/loansnew2.txt')
write_csv_columnar(loans, './data2/emis2.csv')

In [502]:
loans=read_csv_columnar('./data2/loansnew3.txt')
write_csv_columnar(loans, './data2/emis3.csv')

In [503]:
pd.read_csv('./data2/emis3.csv')

Unnamed: 0,amount,duration,rate,down_payment,emi
0,45230.0,48.0,0.07,4300.0,981.0
1,883000.0,16.0,0.14,0.0,60819.0
2,100000.0,12.0,0.1,0.0,8792.0
3,728400.0,120.0,0.12,100000.0,9016.0
4,3637400.0,240.0,0.06,0.0,26060.0
5,82900.0,90.0,0.07,8900.0,1060.0
6,316000.0,16.0,0.13,0.0,21618.0
7,15230.0,48.0,0.08,4300.0,267.0
8,991360.0,99.0,0.08,0.0,13712.0
9,323000.0,27.0,0.09,4720010000.0,36.0


In [506]:

df2= pd.DataFrame(loans)
!head movies2.csv
df2.to_csv('./data2/emis4.csv')

'head' is not recognized as an internal or external command,
operable program or batch file.


In [507]:
pd.read_csv('./data2/emis4.csv')

Unnamed: 0.1,Unnamed: 0,amount,duration,rate,down_payment,emi
0,0,45230.0,48.0,0.07,4300.0,981.0
1,1,883000.0,16.0,0.14,0.0,60819.0
2,2,100000.0,12.0,0.1,0.0,8792.0
3,3,728400.0,120.0,0.12,100000.0,9016.0
4,4,3637400.0,240.0,0.06,0.0,26060.0
5,5,82900.0,90.0,0.07,8900.0,1060.0
6,6,316000.0,16.0,0.13,0.0,21618.0
7,7,15230.0,48.0,0.08,4300.0,267.0
8,8,991360.0,99.0,0.08,0.0,13712.0
9,9,323000.0,27.0,0.09,4720010000.0,36.0
