## Before continuing, please select menu option:  **Cell => All output => clear**

# Open repo PythonTutButty03 and go through the CLI examples

 - CLIargs1.py - Parameter argv in the sys module.
 - CLIargs2.py - Program structure separating CLI parsing.
 - CLIargs3.py - Separating options and arguments.
 - CLIskeleton.py - Uses argparse, a helpful Python standard-library module.
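
The step from raw `sys.argv` to `argparse` in the list above can be sketched as follows. This is a minimal illustration and not the contents of `CLIskeleton.py`; the function name `parse_cli` and the `-v/--verbose` option are hypothetical.

```python
import argparse

def parse_cli(argv=None):
    """Parse command-line arguments (hypothetical example, not CLIskeleton.py)."""
    parser = argparse.ArgumentParser(description='Demo of options and arguments')
    # Positional argument: always required
    parser.add_argument('filename', help='file to process')
    # Option: a flag that defaults to False
    parser.add_argument('-v', '--verbose', action='store_true', help='show extra output')
    # With argv=None, argparse reads sys.argv[1:]
    return parser.parse_args(argv)

# Passing a list stands in for what the shell would supply in sys.argv[1:]
args = parse_cli(['-v', 'data.txt'])
print(args.filename, args.verbose)
```

Keeping the parsing in its own function means the rest of the program only ever sees the parsed `args` object.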

Take a look at clickskeleton.py

# Files & I/O

In [None]:
import os
os.path.exists('testdata')

In [None]:
targetdir = 'testdata'
if os.path.exists(targetdir):
    print(os.listdir(targetdir))

### There are many pathname manipulation functions:
 - https://docs.python.org/3/library/os.path.html
  - Common ones are abspath, basename, dirname, exists, join, split, splitext
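
As a quick sketch of `split` and `splitext` from the list above (the path is only an illustrative string; nothing is read from disk):

```python
import os

# Illustrative path only; the file does not need to exist
path = os.path.join('testdata', 'AllConf.csv')

head, tail = os.path.split(path)      # directory part and final component
root, ext = os.path.splitext(tail)    # name without extension, and the extension

print(head)   # testdata
print(tail)   # AllConf.csv
print(root)   # AllConf
print(ext)    # .csv
```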

In [None]:
fullname = os.path.abspath(r'./testdata')
fullname

In [None]:
os.path.basename(fullname)

In [None]:
os.path.dirname(fullname)

In [None]:
makeafullpath = os.path.join('one', 'two', 'three')
makeafullpath

#### Best ways to list directories

In [None]:
entries = os.listdir('testdata') # older method: builds the whole list in memory (costly if the directory is huge)
entries

In [None]:
entries = os.scandir('testdata') # preferred over listdir: returns a lazy iterator of DirEntry objects
entries

In [None]:
for i in os.scandir('testdata'):
    print(i.name)

In [None]:
from pathlib import Path
entries = Path('testdata')   # This is now the preferred method with most functionality
for entry in entries.iterdir():
    print(entry.name)

In [None]:
# UNIX shells expand name patterns containing wildcards such as ? and * into a list of files. This is called globbing
import glob
glob.glob('testdata/*.txt')  # forward slashes also work on Windows and avoid the invalid '\*' escape

In [None]:
p = Path('.')
list(p.glob('testdata/*.csv'))

In [None]:
# Walking a directory tree and counting the directories visited
count = 0
for dirpath, dirnames, files in os.walk('.'):
    count += 1
count

In [None]:
# For Ref: Using a temporary file
from tempfile import TemporaryFile
fp = TemporaryFile('w+t')
fp.write('Hello universe!')
fp.name

In [None]:
fp.close()
os.remove(fp.name)  # fails because the file was automatically removed upon closing

In [None]:
# For Ref: opening a zipfile
import os
import zipfile

zfname = r'testdata\allconf.zip'

if os.path.exists(zfname):
    with zipfile.ZipFile(zfname, 'r') as zipobj:
        for i in zipobj.namelist():
            print(i)

## Reading & Writing files

<br>Modes are:
- 'r'	open for reading (default)
- 'w'	open for writing, truncating the file first
- 'x'	open for exclusive creation, failing if the file already exists
- 'a'	open for writing, appending to the end of the file if it exists
- 'b'	binary mode
- 't'	text mode (default)
- '+'	open a disk file for updating (reading and writing)
- 'U'	universal newlines mode (deprecated; removed in Python 3.11)

Difference between binary/text:
- Files opened in binary mode return their contents as `bytes` objects, without any Unicode decoding.
- In text mode, the contents are returned as `str`, after the bytes have been decoded using a platform-dependent or explicitly specified encoding.
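
A minimal sketch of that difference, reading the same file in both modes. The file name `demo.txt` and its contents are made up for the demonstration, and the file lives in a temporary directory.

```python
import os
import tempfile

# Write a short UTF-8 file in a temporary directory (the file name is arbitrary)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w', encoding='utf-8') as fh:
        fh.write('caf\u00e9\n')    # 'é' takes two bytes in UTF-8

    with open(path, 'rb') as fh:                    # binary mode: raw bytes
        raw = fh.read()
    with open(path, 'r', encoding='utf-8') as fh:   # text mode: decoded str
        text = fh.read()

print(type(raw), raw)
print(type(text), repr(text))
```

The byte count can differ between platforms because text mode on Windows writes `'\n'` as `'\r\n'`; the decoded `str` is the same either way.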

#### File methods:
 - `.read(size=-1)` Reads the entire file, or up to *size* characters (text mode) or bytes (binary mode).
 - `.readline(size=-1)` Reads the next line, or up to *size* characters of it.
 - `.readlines()` Reads the remaining lines from the file as a list (each line keeps its '\n').
 - `.write(string or bytes)` Writes to the file.
 - `.writelines(seq)` Writes the sequence to the file (note that line endings '\n' are not appended).
 
Note that the `print` function also accepts a `file` argument, which can be any open file object:

```
print(*args, file=sys.stdout)
```
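
A small sketch contrasting `print(..., file=...)` with `.writelines()`. It uses `io.StringIO`, an in-memory text file, purely to avoid touching the disk; a file opened with `open()` behaves the same way.

```python
import io

# io.StringIO is an in-memory text file, used here to avoid touching the disk
buffer = io.StringIO()

print('first line', file=buffer)            # print appends '\n' by default
buffer.writelines(['second', 'third\n'])    # no separators are added between items

buffer.seek(0)
lines = buffer.readlines()                  # each item keeps its trailing '\n'
print(lines)   # ['first line\n', 'secondthird\n']
```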
 

In [None]:
myfn = 'mytest.txt'
fd = open(myfn, 'wb') # Note that mode of 'w' will open for writing and truncate the file first

In [None]:
!dir my*

In [None]:
# Get the file mode used
print(fd.mode)

In [None]:
# Get the file's name
print(fd.name)

In [None]:
# Write text to a file with a newline
fd.write(bytes("Write me to the file\n", 'UTF-8'))

In [None]:
# Close the file
fd.close()

In [None]:
!type mytest.txt

In [None]:
# Opens a file for reading and writing
fd = open(myfn, "rb+")  # the + indicates update mode, you can also write to it (no truncation)
# Read text from the file
text = fd.read()
print(type(text))
print(text)

In [None]:
# Rebinding fd drops the previous file object, which CPython then closes implicitly (an explicit close() is safer):
fd = open(myfn, "r+")  # Note by default this is opened as text
fd.seek(6)
fd.write('XX')
fd.seek(0)
# Read text from the file
text = fd.read()
print(type(text))
print(text)

In [None]:
# Close the file
fd.close()

In [None]:
# using the with statement context manager:
with open(myfn, "r+") as fd:
    # Read text from the file
    text = fd.read()
    print(type(text))
    print(text)
# The file is closed automatically when the with block ends
fd.seek(0)  # raises ValueError: I/O operation on closed file

In [None]:
import os
# Delete the file
os.remove(myfn)

In [None]:
!dir mytest.txt

In [None]:
# Looking to see if the demo files are available with a Jupyter special execute prefix (!):
!dir testdata

In [None]:
filename = 'SSHOW_SYS.txt'
filename = 'AllConf.csv'
pathname = os.path.join('testdata', filename)
if os.path.exists(pathname): print('Yes') 

In [None]:
if os.path.exists(pathname):
    count = 0
    with open(pathname) as fd:
        line = fd.readline()
        while line != '':  # EOF is signalled by an empty string
            line=line.strip()
            print(line)
            count += 1
            if count > 3: break
            line = fd.readline()

In [None]:
if os.path.exists(pathname):
    count = 0
    with open(pathname) as fd:
        for line in fd.readlines():   # this will return a full list of the entire file
            line=line.strip()   # Comment this line and see what happens
            print(line)
            count += 1
            if count > 3: break

In [None]:
# This final approach is more Pythonic and can be quicker and more memory efficient. 
# Therefore, it is suggested you use this instead.
if os.path.exists(pathname):
    with open(pathname) as fd:
        for i, line in enumerate(fd):   # Iterating over a file object yields one line at a time, like repeated .readline() calls
            line = line.strip()
            print(f'[{i}] {line}')
            if i > 3: break  

In [None]:
# An example of opening a zip file and directly reading the internal file 
import zipfile
with zipfile.ZipFile(r'testdata\allconf.zip', 'r') as zipobj:
    text = zipobj.read('AllConf.csv').decode()   # ZipFile.read() returns bytes, so decode to get a str

for i, line in enumerate(text.split('\n')):
    print(line)
    if i > 5:
        break

### Working with two files at once:
There are times when you may want to read a file and write to another file at the same time. Here is an example:

``` python
d_path = 'dog_breeds.txt'
d_r_path = 'dog_breeds_reversed.txt'
with open(d_path, 'r') as reader, open(d_r_path, 'w') as writer:
    dog_breeds = reader.readlines()
    writer.writelines(reversed(dog_breeds))
```
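
The example above loads the whole input with `.readlines()`. For large files, a streaming variant is sketched below; it assumes a line-by-line transformation (upper-casing here), and the output file name `dog_breeds_upper.txt` is hypothetical. A temporary directory keeps the sketch self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'dog_breeds.txt')          # stand-in input file
    dst = os.path.join(tmp, 'dog_breeds_upper.txt')    # hypothetical output name
    with open(src, 'w') as fh:
        fh.write('Pug\nBeagle\n')

    # Stream line by line: only one line is held in memory at a time
    with open(src) as reader, open(dst, 'w') as writer:
        for line in reader:
            writer.write(line.upper())

    with open(dst) as fh:
        result = fh.read()

print(result)
```

Reversing the line order, as in the original example, needs the full list in memory, so streaming only suits transformations that handle one line at a time.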

### Don't reinvent the snake:
Additionally, the standard library includes modules that can help you:

* wave: read and write WAV files (audio)
* aifc: read and write AIFF and AIFC files (audio; removed in Python 3.13)
* sunau: read and write Sun AU files (removed in Python 3.13)
* tarfile: read and write tar archive files
* zipfile: work with ZIP archives
* configparser: easily create and parse configuration files
* xml.etree.ElementTree: create or read XML based files
* msilib: read and write Microsoft Installer files (removed in Python 3.13)
* plistlib: generate and parse Mac OS X .plist files
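
As one example from the list, here is a minimal `configparser` round trip. The section and key names are made up, and an in-memory buffer stands in for a real `.ini` file.

```python
import configparser
import io

# Section and key names below are made up for the demonstration
config = configparser.ConfigParser()
config['server'] = {'host': 'localhost', 'port': '8080'}

# Write to an in-memory buffer instead of a real .ini file
buffer = io.StringIO()
config.write(buffer)

# Parse the text back into a fresh parser
parsed = configparser.ConfigParser()
parsed.read_string(buffer.getvalue())
host = parsed['server']['host']
port = parsed.getint('server', 'port')   # values are stored as str; getint converts
print(host, port)
```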

There are plenty more. There are also many third-party packages available on PyPI. Some popular ones are:

* PyPDF2: PDF toolkit
* xlwings: read and write Excel files
* xlsxwriter : write Excel files
* Pillow: image reading and manipulation

## Exercise:
1. In VS Code, create a `mypackage` folder under the `pytut` directory (which you should have created previously).
1. Copy all the `*.txt` and `*.csv` files from PytutButty01 into this new directory.
1. Copy the `CLIskeleton.py` program (under PytutButty03) into this folder and rename it `mygrep.py`.
1. Write a CLI program that accepts a filename and a text string:

```
mygrep.py [options] <filename> <text string>
-c = case sensitive
```

 - It needs to output every line that contains the text string.
 - By default, the search should be case-insensitive.



## Exercise (Part2):
1. Copy the `CLIskeleton.py` program (under PytutButty03) into the `mypackage` folder and rename it `<name>tool.py`.
1. You can play around and test in the notebook, but the objective is to do the following as a standalone CLI program, so edit and modify the skeleton in VS Code:

- Write some code to process either the `SSHOW_SYS.txt` or `Allconf.csv` or `poolinfo.txt` and print out your favorite section.
    - `SSHOW_SYS.txt` example: text between `???/switchshow` to blank line.
    - `Allconf.csv` example: text between `<<System Option Information>>` to next `<<?>>` section.
    - `poolinfo.txt` example: text between `POOL-ID` and a blank line.

### As a hint:
 - After the argparse processing, call a new function `main(args.filename)`.
 - Create this new `main` function to process and output the file.
 - While developing, it can be a good idea to put in an output limit (e.g. a maximum of 20 lines).
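
The structure these hints describe might look like the sketch below. It only echoes lines up to a limit and does not do the section extraction, which is the exercise. The names `parse_cli` and `MAX_LINES` are hypothetical, and the demonstration runs on a throwaway file rather than the course data.

```python
import argparse
import os
import tempfile

MAX_LINES = 20  # temporary output limit while developing

def main(filename):
    """Print at most MAX_LINES lines of the file and return the number printed."""
    printed = 0
    with open(filename) as fd:
        for line in fd:
            print(line.rstrip('\n'))
            printed += 1
            if printed >= MAX_LINES:
                break
    return printed

def parse_cli(argv=None):
    """Parse the CLI; with argv=None argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(description='Print a section of a file')
    parser.add_argument('filename', help='file to process')
    return parser.parse_args(argv)

# Demonstration on a throwaway 30-line file; in the real script, call
# parse_cli() with no argument so that argparse reads the command line
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    with open(sample, 'w') as fh:
        fh.writelines(f'line {n}\n' for n in range(30))
    args = parse_cli([sample])
    printed = main(args.filename)
```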