# Lecture 6 - User and file input

A program can take input from the console (the user) or from the filesystem (a file) and use it in its computations.

Let's first see how to get input from the user via the console.

The function for getting input from the user is `input()`.

In [None]:
my_input = input("Write something: ")

In [None]:
print(my_input)

Let's ask the user for a number and check whether it is prime.

In [None]:
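import math

def is_prime(n):
    # Numbers below 2 (0, 1 and negatives) are not prime
    if n < 2:
        return False
    # A composite n has a divisor no larger than its square root
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True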

In [None]:
is_prime(7919)

In [None]:
f_input = input("Please enter number to check primeness:")
#type(f_input)
print(is_prime(f_input))

Uh oh! What just happened?

The `input()` function always returns a string, whereas our function expects an integer. Let's convert the input with `int()` and try again:

In [None]:
f_input = input("Please enter number to check primeness:")
print(is_prime(int(f_input)))
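
Conversion with `int()` raises a `ValueError` when the text is not a whole number. Below is a minimal sketch of guarding against that. The helper name `parse_int` is hypothetical (it is not part of the lecture code), and plain strings stand in for what `input()` would return.

```python
def parse_int(text):
    """Return text converted to int, or None if it is not a whole number.

    Hypothetical helper, for illustration only.
    """
    try:
        return int(text.strip())
    except ValueError:
        return None


# These strings stand in for values typed by the user.
for raw in ["7919", " 12 ", "seven", "3.5"]:
    value = parse_int(raw)
    if value is None:
        print(repr(raw), "is not a whole number")
    else:
        print(repr(raw), "->", value)
```

Returning `None` is only one possible design; asking the user again inside a loop would work as well.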

Being able to get input from the user allows us to compute results dynamically, tailored to that user.

For example, we can process a list of numbers or count the words in a text entered by the user.

### Example 1

Print the squares of the odd numbers in a comma-separated list provided by the user.

In [None]:
values = input()

# numbers = values.split(",")
# for number in numbers:
#    if int(number) % 2 != 0:
#        print(int(number) ** 2)

# is equivalent to

numbers = [int(x)**2 for x in values.split(",") if int(x)%2!=0]
print(numbers)

# input: 1,2,9,8,7,6,5
# will print [1, 81, 49, 25]
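
The list comprehension above converts each item twice and fails on an empty entry such as a trailing comma. One possible variation is sketched below, assuming a hypothetical function name `squares_of_odds` that takes the typed text as an argument.

```python
def squares_of_odds(text):
    """Return the squares of the odd integers in a comma-separated string.

    Hypothetical helper, for illustration only.
    """
    # Convert each non-empty item once; int() tolerates surrounding spaces.
    numbers = [int(item) for item in text.split(",") if item.strip()]
    return [n ** 2 for n in numbers if n % 2 != 0]


print(squares_of_odds("1,2,9,8,7,6,5"))  # [1, 81, 49, 25]
print(squares_of_odds("1, 3, ,4,"))      # [1, 9]
```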

### Example 2

Get a sentence from the user and count how often each word occurs.

In [None]:
freq = {}   # frequency of words in text
line = input("Please type a sentence: ")
for word in line.split():
    #word=word.lower()
    freq[word] = freq.get(word,0)+1
words = list(freq.keys())
for w in words:
    print("%s: %d" % (w,freq[w]))
# try the sentence: to be or not to be
# what about The is the article in the English language
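
As an alternative to the dictionary with `freq.get(word, 0) + 1`, the standard library offers `collections.Counter`. The sketch below assumes a hypothetical function name `word_counts` and lowercases the text first, so that "The" and "the" are counted together.

```python
from collections import Counter


def word_counts(sentence):
    """Count words case-insensitively (hypothetical helper for illustration)."""
    return Counter(sentence.lower().split())


counts = word_counts("To be or not to be")
for word, count in counts.most_common():
    print(f"{word}: {count}")
```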

### Example 3 Number guessing game

This game illustrates an interesting use of the `while` loop: we keep asking the user for input and report the result until the guess is correct.

In [None]:
import random

random_number = random.randrange(1, 10)  # random integer from 1 to 9 (10 is excluded)

while True:
    guess = int(input("What could it be? > "))  # ask as long as answer is not correct
    if guess == random_number:
        print("CONGRATS YOU GOT IT")
        break
    elif guess > random_number:
        print("YOUR GUESS IS HIGH")
    else:  # the only remaining case: guess < random_number
        print("YOUR GUESS IS LOW")

# While loop

If you know the number of iterations in advance, a `for` loop can be used. However, if you don't know how many iterations are needed, or if you want the loop to finish when a certain criterion is met, you should use a `while` loop.

This is the template for a `while` loop:

```python
while test_expression:
    # do something
```

The important point here is that **you need to increment or update any index/variable within the loop** so that the test expression eventually becomes *False*.

For example, the loop below runs as long as `i` is smaller than 6, that is, until `i` reaches 6.

In [None]:
i = 1
while i < 6:
    print(i)
    i += 1

<div class="alert alert-block alert-warning">
    <i class="fas fa-fw fa-question-circle mr-3 align-self-center"></i>
    <b>Question:</b> What happens if you forget to increase the value of i? <br>
</div>
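
A `while` loop is most useful when the number of iterations is not known in advance. As a minimal sketch, the hypothetical function `halvings_until_one` below counts how many times a number can be halved before it reaches 1. The loop variable is updated on every pass, so the test eventually becomes false.

```python
def halvings_until_one(n):
    """Count integer halvings needed to bring n down to 1.

    Hypothetical helper, for illustration only.
    """
    steps = 0
    while n > 1:
        n //= 2      # update the loop variable so the test eventually fails
        steps += 1
    return steps


print(halvings_until_one(1000))  # 9
```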

## Better control of loops: `break` and `continue` 

In a `while` loop (or in a `for` loop), if you wish to skip a certain iteration you can use the `continue` statement, and if you wish to terminate the loop entirely you can use the `break` statement.

In the example below, we go over the numbers one by one, skip (*i.e.* `continue`) the even ones, and terminate the loop (*i.e.* `break`) once the number is 10 or greater. **Notice that the actual loop itself is an infinite loop.**

In [None]:
num=0

while True:             # <---
    num += 1            #    |
                        #    |
    if num % 2 == 0:    #    |
        continue        # ----

    if num >= 10:    # what happens at == ?
        break
    print(num)
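
The same two statements work in a `for` loop. The sketch below is a hypothetical rewrite of the example (the function name `odd_numbers_below` and the upper bound of 100 are assumptions made for illustration); it collects the numbers in a list instead of printing them.

```python
def odd_numbers_below(limit):
    """Collect odd numbers, stopping once `limit` is reached."""
    result = []
    for num in range(1, 100):   # 100 is an arbitrary upper bound for this sketch
        if num >= limit:
            break               # leave the loop entirely
        if num % 2 == 0:
            continue            # skip the rest of this iteration
        result.append(num)
    return result


print(odd_numbers_below(10))  # [1, 3, 5, 7, 9]
```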


## Exercise

1. If you change the location of `num += 1`, would the result be the same or different?

```python
num=1

while True:
    if num % 2 == 0: 
        continue
    if num >= 10:
        break
    print(num)
    num += 1    ## <- carried here
```

2. The code below reads a number from the user and prints its square. What will happen if the user enters a floating-point number? How can we fix this problem?

```python
result = int(input("Write a number to be squared: ")) 
print(result**2)
```

3. Make the necessary modifications to "Example 3 Number guessing game" so that it asks the user "Do you want to continue?", starts over if the user enters "Yes", and quits if the user enters "No".


# File input and output


## File input

The `open()` function is used to open files. It returns a file object (a handle) that you assign to a variable and then use whenever file-related functions or methods are needed. For example, to read a line from a file, you open the file, assign the handle to a variable, and then call the `readline()` method on that handle.

`f.read(size)` reads some quantity of data and returns it as a string where size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned.
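
The effect of the `size` argument can be sketched on a small throwaway file. The file name `sample.txt` and its contents are assumptions made for this illustration, so the sketch does not depend on the lecture data.

```python
import os
import tempfile

# Create a small temporary file to read from.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
f = open(path, "w")
f.write("abcdefghij")
f.close()

f = open(path)
first = f.read(4)    # the first 4 characters
rest = f.read()      # size omitted: everything that is left
f.close()

print(first)  # abcd
print(rest)   # efghij
```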

In [None]:
f=open("data/jane-austen-emma.txt")
# f.read() would return the whole book at once
f.readline()

As noted above, the `read()` method retrieves the contents of the whole file at once, which can be problematic for large files. The `readline()` method retrieves a single line instead. Here it returned the first line of the file.

In [None]:
f.readline()

Did you notice that the second call to `readline()` returned the second line of the file? `readline()` retrieves whichever line comes next: the file object tracks a position in the file, and each `readline()` call advances that position to the start of the following line.
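
The position tracking can be made visible on a small temporary file. The sketch below also uses `seek(0)`, a file method not covered in this lecture, to move the position back to the start. The file name and contents are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "three_lines.txt")
f = open(path, "w")
f.write("first\nsecond\nthird\n")
f.close()

f = open(path)
line_a = f.readline()   # 'first\n'
line_b = f.readline()   # 'second\n', because the position has moved on
f.seek(0)               # move the position back to the start of the file
line_c = f.readline()   # 'first\n' again
f.close()

print(repr(line_a), repr(line_b), repr(line_c))
```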

One way to access all the content of a file is to loop over its lines. This is the most memory-efficient approach because only a single line is retrieved and processed at each iteration. In the example below, a large file is read line by line; instead of printing the contents (and filling up the screen), we count the number of lines.

In [None]:
# first approach
f=open("data/jane-austen-emma.txt")
count=0
for line in f:
    # print(line)
    count += 1
print(count)
f.close()

The second approach is to use the `readlines()` method, which reads every line of the file into a list. The advantage is that you can access any line by index, even after the file is closed. The disadvantage is that the list of a big file takes up a lot of memory.

In [None]:
# second approach
f=open("data/jane-austen-emma.txt")
lines = f.readlines()
print(len(lines))
f.close()

When you're done with a file, call `f.close()` to close it and free up any system resources taken up by the open file.
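
A common alternative to calling `f.close()` yourself is the `with` statement, which closes the file automatically when the indented block ends, even if an error occurs inside it. The sketch below uses a temporary file; its name and contents are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, "w") as f:
    f.write("line one\nline two\n")
# The file is already closed here; no explicit f.close() is needed.

with open(path) as f:
    lines_read = f.readlines()

print(f.closed)     # True
print(lines_read)   # ['line one\n', 'line two\n']
```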

As mentioned above, with `readlines()` you can keep the contents of a file in a list which is accessible via index (or slice) even after the file is closed.

In [None]:
lines[0:9]

In [None]:
lines[-10:]

Let's do a word count using the list of lines. To get more accurate results, we convert each word to lowercase before counting.

In [None]:
freq = {}
for line in lines: 
    for word in line.split():
        word = word.lower()
        freq[word] = freq.get(word,0)+1

Let's view 10 words and their counts

In [None]:
words = list(freq.keys())[0:10]
for w in words:
    print("%s: %d" % (w,freq[w]))

### `line` versus `lines`

There's a big difference between 
* lists
* generators/iterators

In lists, elements can be accessed directly by index. However, this comes at a cost: the whole list takes up space in memory.

In generators, there is usually no direct access; the elements are produced on demand. Thus, they don't use memory to store all the elements at once.

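Let's see the size of the `lines` list in memory. Note that `sys.getsizeof()` reports only the list object itself (essentially its references), not the strings it points to, so the true footprint is larger.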


In [None]:
import sys
sys.getsizeof(lines)
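
The difference between a list and a generator can be sketched without any file. Below, the same sequence of squares is built once as a list and once as a generator expression; the variable names are assumptions made for illustration. The exact byte counts depend on the Python version, so none are stated here.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values stored at once
squares_gen = (n * n for n in range(100_000))    # values produced on demand

list_size = sys.getsizeof(squares_list)   # size of the list object only
gen_size = sys.getsizeof(squares_gen)     # small, regardless of the range
print(list_size, gen_size)

# Both yield the same values; the generator can be consumed only once.
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)  # True
```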

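By iterating over the file with `for line in f`, we can count words without holding all the lines in memory at once; only the current line and the dictionary of counts are stored: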

In [None]:

f=open("data/jane-austen-emma.txt")
another_dict = {}
for line in f: 
    for word in line.split():
        another_dict[word.lower()] = another_dict.get(word.lower(),0)+1
f.close()

list(another_dict.items())[0:9]


How many lines, words and unique words are there?

In [None]:
uniq_words = len(freq.keys())
total_words = sum([len(line.split()) for line in lines])


In [None]:
template= "In the novel, there are %d lines \
and total of %d words are used. Number of \
unique words is %d"

print(template % (len(lines),total_words,uniq_words))

## Summary

* `f.read()` reads the whole file contents into a single string
* `f.readline()` reads the next line each time it's executed (*memory efficient, since only a single line is processed at a given time*)
  * an alternative is `for line in f:`
* `f.readlines()` reads **all** lines into a list

## Output to files

Open file for writing. 

> Be aware, `w` mode will overwrite existing file!

In [None]:
f = open('data/test.txt', 'w')

A file named `test.txt` has been opened in the `data` folder, and `f` is the file object. There are various ways to write to the file.

In [None]:
f.write("Hello world")

Let's check and see the contents of the file.

Why is it empty?

In [None]:
f.close()

> We discussed this last week. Contents written to a file are not saved to disk immediately; they are buffered and written out when the buffer is flushed or the file is closed.
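
The buffering can be observed by reading the file through a second handle before and after calling `f.flush()`, which forces the buffered text out without closing the file. This is a sketch on a temporary file (the file name is an assumption); with default buffering the first read is typically empty for such a short text.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffered.txt")

f = open(path, "w")
f.write("Hello world")

check = open(path)
before = check.read()   # typically empty: the text is still in the buffer
check.close()

f.flush()               # push the buffered text out to the file

check = open(path)
after = check.read()    # now the text is visible
check.close()
f.close()

print(repr(before), repr(after))
```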

You can also write to a file with the `print` function by passing the `file=` argument.

In [None]:
f = open('data/test2.txt', 'w')
print("Second hello..", file=f)
print("to the screen")
print("to the file", file=f)
f.close()

In [None]:
f = open('data/test2.txt', 'a')
print("This is appended to file..", file=f)
f.close()

Now, a more serious example:

In [None]:
# source: https://scipython.com/book/chapter-2-the-core-python-language-i/examples/writing-numbers-to-a-file/
f = open('data/powers.txt', 'w')
for i in range(1,1001):
    print(i, i**2, i**3, i**4, sep=', ', file=f)
f.close()

Other modes for reading/writing files:

* **w** : Write mode. If the file does not exist, a new file is created. *But* if the file exists, it is truncated (its contents are erased).
* **a** : Append mode. Lines are added to the end of the file (if the file does not exist, a new file is created).
* **x** : Exclusive creation. A new file is created; if the file already exists, the operation fails.
* **r** : Read mode (the default).
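
The behaviour of these modes can be sketched on a temporary file. The file name `modes.txt` is an assumption made for illustration, and the sketch uses `with` blocks, which close each file automatically at the end of the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "x") as f:      # creates the file; it must not exist yet
    f.write("first\n")

with open(path, "a") as f:      # appends to the existing content
    f.write("second\n")

with open(path) as f:           # mode omitted, so the file is opened for reading
    after_append = f.read()

try:
    open(path, "x")             # the file exists now, so this fails
    x_failed = False
except FileExistsError:
    x_failed = True

with open(path, "w") as f:      # truncates the existing content
    f.write("replaced\n")

with open(path) as f:
    after_write = f.read()

print(repr(after_append), x_failed, repr(after_write))
```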

Let's read the cubes back from the file. We'll collect the values in the third column.

In [None]:
f = open('data/powers.txt', 'r')
cubes= []
for line in f.readlines():
    fields = line.split(',')
    cubes.append(int(fields[2]))
f.close()
n = 5
print(n, 'cubed is', cubes[n-1])

In [None]:
len(cubes)

In [None]:
cubes[1:5]
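
The same approach extends to reading every column. The sketch below writes a short version of the powers file to a temporary location (so it does not touch `data/powers.txt`) and reads each line back as a list of integers. `int()` ignores the surrounding spaces and the trailing newline, so no extra stripping is needed.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "powers.txt")

# Write the first five rows in the same format as above.
f = open(path, "w")
for i in range(1, 6):
    print(i, i**2, i**3, i**4, sep=", ", file=f)
f.close()

# Read every row back as a list of four integers.
rows = []
f = open(path)
for line in f:
    rows.append([int(field) for field in line.split(",")])
f.close()

print(rows[2])  # [3, 9, 27, 81]
```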