<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Files, Folders & OS (Need)</span></div>

# Important concepts

## Path

A path specifies a location on the computer; it leads to a file or a folder.

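**The path to the SP2273 folder on my computer**

In [2]:
path_to_sp2273 = r'C:\03 SzeChie\NUS\Courses\SP2273'

Typed bare into a cell, this path raises a ```SyntaxError``` because Python reads ```\``` outside a string as a line continuation character. A raw string (```r'...'```) keeps every backslash as it is.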

## More about relative paths

| **Notation** |    **Meaning**   |
|:------------:|:----------------:|
|       .      |    This folder   |
|      ..      | One folder above |

```.\data-files\data-01.txt``` means that the file ```data-01.txt``` is in the folder ```data-files```, which is inside the current folder.

```..\data-files\data-01.txt``` means that the file ```data-01.txt``` is in the folder ```data-files```, which is inside the folder one level above the current one.
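
To check my reading of the notation, here is a minimal sketch using ```os.path.normpath()```, which resolves ```.``` and ```..``` inside a path. The folder names ```project``` and ```notebooks``` are made up for illustration only.

```python
import os

# Hypothetical folder names, used only to illustrate the notation
here = os.path.join('project', 'notebooks')

# '.' stays in the same folder, '..' climbs one folder up
same_folder = os.path.normpath(os.path.join(here, '.', 'data-files', 'data-01.txt'))
folder_above = os.path.normpath(os.path.join(here, '..', 'data-files', 'data-01.txt'))

print(same_folder)    # data-files sits inside project/notebooks
print(folder_above)   # data-files sits inside project

# Relative paths are resolved against the current working directory
print(os.path.abspath('.'))
```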

## Path separator

Windows uses \ as the path separator, while macOS and Linux use /.

Hence, when sharing code, if we want it to work on both systems, we cannot hardcode either path separator.

## Text files vs. Binary files

Text files store plain characters, so almost any software can open them and show their contents. Examples include ```.txt```, ```.md```, and ```.csv```.

Binary files store raw bytes and need the right software to interpret what they contain. Examples include ```.png```.

Some binary files can only run on specific OSs. The ```Excel.app``` on a Mac will not run on Windows, and the ```Excel.exe``` file will not run on macOS either.
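
A minimal sketch of the difference, assuming a throwaway file called ```note.txt``` (a made-up name) in a temporary folder: the same file is read once in text mode (```'r'```) and once in binary mode (```'rb'```).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'note.txt')   # hypothetical file name

    with open(path, 'w', encoding='utf-8') as file:
        file.write('SP2273')

    # Text mode decodes the bytes into a str
    with open(path, 'r', encoding='utf-8') as file:
        as_text = file.read()

    # Binary mode returns the raw bytes untouched
    with open(path, 'rb') as file:
        as_bytes = file.read()

print(type(as_text), as_text)
print(type(as_bytes), list(as_bytes))   # one integer per byte
```

A text file is still bytes on disk; text mode simply decodes them into characters for me.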

## Extensions

Files are usually named with an extension at the end, as in ```name.extension```. The extension tells the OS which software to use to open the file.

For example, ```.xlsx``` files open in Excel while ```.pptx``` files open in PowerPoint.

# 2 Opening and closing files

## 2.1 Reading data

In [4]:
with open('spectrum-01.txt', 'r') as file:
    file_content = file.read()

print(file_content)

Light Intensity, Ch A vs Actual Angular Position, Run #4
Actual Angular Position (  )	Light Intensity, Ch A ( % max )
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.3
0.000	-0.2
0.000	-0.2
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.004	-0.1
0.010	-0.2
0.018	-0.2
0.024	-0.3
0.029	-0.3
0.033	-0.3
0.036	-0.2
0.039	-0.1
0.043	-0.1
0.047	-0.1
0.053	-0.1
0.060	-0.1
0.066	-0.1
0.069	-0.1
0.073	-0.1
0.076	-0.1
0.079	-0.1
0.081	-0.1
0.082	-0.1
0.083	-0.2
0.083	-0.2
0.086	-0.2
0.090	-0.2
0.095	-0.2
0.100	-0.3
0.103	-0.3
0.104	-0.2
0.105	-0.3
0.107	-0.2
0.110	-0.2
0.115	-0.1
0.122	-0.2
0.128	-0.1
0.134	-0.2
0.139	-0.1
0.144	-0.2
0.150	-0.2
0.157	-0.2
0.164	-0.2
0.170	-0.3
0.175	-0.3
0.180	-0.2
0.185	-0.2
0.191	-0.1
0.195	-0.1
0.198	-0.2
0.201	-0.1
0.204	-0.2
0.206	-0.2
0.208	-0.3
0.210	-0.3
0.213	-0.1
0.217	0.3
0.222	0.6
0.226	0.2
0.230	0.0
0.233	-0.1
0.235	-0.1
0.237	

```open()``` opens the file. ```'r'``` specifies that the file is opened for reading only. ```with``` closes the file automatically once the indented block finishes, so there is no need to close it by hand.

```python
with open(filename, 'r') as file:
    file_content = file.read()
```
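
Once the file is read, the numbers still have to be pulled out of the text. A minimal sketch, assuming the layout shown in the output above (two header lines, then two tab-separated columns). The sample file written here is a made-up stand-in for ```spectrum-01.txt```, and the names ```angles``` and ```intensities``` are my own.

```python
import os
import tempfile

# A small made-up sample in the same layout as the output above
sample = (
    'Light Intensity, Ch A vs Actual Angular Position, Run #4\n'
    'Actual Angular Position\tLight Intensity, Ch A ( % max )\n'
    '0.000\t-0.2\n'
    '0.004\t-0.1\n'
    '0.217\t0.3\n'
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample-spectrum.txt')
    with open(path, 'w') as file:
        file.write(sample)

    angles, intensities = [], []
    with open(path, 'r') as file:
        for line_number, line in enumerate(file):
            if line_number < 2:          # skip the two header lines
                continue
            parts = line.split('\t')
            if len(parts) != 2 or not parts[1].strip():
                continue                 # skip blank or incomplete lines
            angles.append(float(parts[0]))
            intensities.append(float(parts[1]))

print(angles)
print(intensities)
```

Looping over the file object reads one line at a time, which avoids holding a very large file in memory as a single string.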

## 2.2 Writing data

In [2]:
text = 'Far out in the uncharted backwaters of the unfashionable end of the western spiral arm of the Galaxy lies a small unregarded yellow sun.\nOrbiting this at a distance of roughly ninety-two million miles is an utterly insignificant little blue green planet whose ape-descended life forms are so amazingly primitive that they still think digital watches are a pretty neat idea.'

### Writing to a file in one go

In [3]:
with open('my-text-once.txt', 'w') as file:
    file.write(text)

This creates a file named ```my-text-once.txt``` in the current directory, replacing any existing file of that name. ```'w'``` indicates that the file is opened for writing.
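
A minimal sketch of how ```'w'``` differs from the append mode ```'a'```, using a throwaway file (```demo.txt``` is a made-up name) in a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')   # hypothetical file name

    with open(path, 'w') as file:   # creates the file
        file.write('first\n')

    with open(path, 'w') as file:   # 'w' again: the old content is discarded
        file.write('second\n')

    with open(path, 'a') as file:   # 'a': new content goes to the end
        file.write('third\n')

    with open(path, 'r') as file:
        result = file.read()

print(result)
```

Only ```second``` and ```third``` survive, so ```'w'``` should be used with care on files that already hold data.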

### Writing to a file, line by line

In [4]:
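with open('my-text-lines.txt', 'w') as file:
    for line in text.splitlines():
        file.write(line + '\n')   # splitlines() drops the newline, so add it back

This writes the file one line at a time. It is slower than writing in one go because each line needs its own ```write()``` call inside the loop.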
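
Note that ```writelines()``` takes a list of strings and writes them back to back without adding line breaks, so each string needs its own ```'\n'```. A minimal sketch with a made-up two-line text and a throwaway file:

```python
import os
import tempfile

text = 'line one\nline two'   # made-up text for illustration

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'lines.txt')

    # Re-attach the newline that splitlines() strips off
    lines = [line + '\n' for line in text.splitlines()]

    with open(path, 'w') as file:
        file.writelines(lines)

    with open(path, 'r') as file:
        lines_read = file.readlines()

print(lines_read)
```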

# 3 Some useful packages

In [5]:
import os
import glob
import shutil

```os``` is used to talk to the OS: to create, modify and delete files and folders, and to write OS-agnostic code.

```glob``` is used to search for files.

```shutil``` is used to copy and move files, and to delete whole folders.

# 4 OS safe paths

In [6]:
path = os.path.join('.', 'all-data', 'sg-data', 'data-01.txt')
print(path)

.\all-data\sg-data\data-01.txt


```all-data``` -> ```sg-data``` -> ```data-01.txt```

```os.path.join()``` joins the parts with the path separator of the OS running the code, so the same code works on Windows, macOS and Linux.
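
A short sketch of what ```os.path.join()``` does behind the scenes, compared with a hardcoded path. ```os.sep``` holds the separator of the current OS; the variable name ```hardcoded``` is my own.

```python
import os

path = os.path.join('.', 'all-data', 'sg-data', 'data-01.txt')

print(repr(os.sep))          # the separator of the OS running the code
print(path.split(os.sep))    # the pieces that were joined

# Hardcoding one separator gives the same path only on an OS that uses it
hardcoded = './all-data/sg-data/data-01.txt'
print(path == hardcoded)     # True where os.sep is '/', False on Windows
```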

# 5 Folders

## 5.1 Creating folders

In [7]:
os.mkdir('people')

for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    print(f'Creating {path}')
    os.mkdir(path)

Creating people\John
Creating people\Paul
Creating people\Ringo


Creating folders for John, Paul and Ringo.

## 5.2 Checking for existence

### Using try-except

In [8]:
for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    try:
        os.mkdir(path)
        print(f'Creating {path}')
    except FileExistsError:
        print(f'{path} already exists; skipping creation.')

people\John already exists; skipping creation.
people\Paul already exists; skipping creation.
people\Ringo already exists; skipping creation.


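```os.mkdir()``` raises a ```FileExistsError``` if the folder already exists, so the code in 5.1 cannot be run twice. Wrapping the call in ```try-except``` catches the error, and the loop skips the folders that are already there.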

### Using os.path.exists()

In [9]:
for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    if os.path.exists(path):
        print(f'{path} already exists; skipping creation.')
    else:
        os.mkdir(path)
        print(f'Creating {path}')

people\John already exists; skipping creation.
people\Paul already exists; skipping creation.
people\Ringo already exists; skipping creation.
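
A third option, sketched here as an alternative rather than what the cells above use, is ```os.makedirs()``` with ```exist_ok=True```. It also creates missing parent folders and stays silent if the folder is already there. The sketch works inside a temporary folder so that it does not touch the real ```people``` folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    for attempt in range(2):                      # run the same code twice
        for person in ['John', 'Paul', 'Ringo']:
            path = os.path.join(folder, 'people', person)
            # Creates 'people' too if it is missing; no error if path exists
            os.makedirs(path, exist_ok=True)

    created = sorted(os.listdir(os.path.join(folder, 'people')))

print(created)
```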


## 5.3 Copying files

In [12]:
for person in ['John', 'Paul', 'Ringo']:
    path_to_destination = os.path.join('people', person)   # path to each person's folder
    shutil.copy('sp2273_logo.png', path_to_destination)
    print(f'Copied file to {path_to_destination}')

Copied file to people\John
Copied file to people\Paul
Copied file to people\Ringo


In [14]:
for person in ['John', 'Paul', 'Ringo']:
    # Create the subfolder if it is not there yet
    path_to_subfolder = os.path.join('people', person, 'img')
    if not os.path.exists(path_to_subfolder):
        os.mkdir(path_to_subfolder)
        print(f'Creating {path_to_subfolder}')

    # Current path of the copied file
    path_to_destination = os.path.join('people', person, 'sp2273_logo.png')

    # New path inside the subfolder
    path_to_imgs = os.path.join('people', person, 'img', 'sp2273_logo.png')
    shutil.move(path_to_destination, path_to_imgs)
    print(f'Moved file to {path_to_imgs}')

Moved file to people\John\img\sp2273_logo.png
Creating people\Paul\img
Moved file to people\Paul\img\sp2273_logo.png
Creating people\Ringo\img
Moved file to people\Ringo\img\sp2273_logo.png
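
To see the difference between copying and moving, here is a minimal sketch in a temporary folder. The file ```logo.txt``` is a made-up stand-in for the logo. Both ```shutil.copy()``` and ```shutil.move()``` return the path of the new file.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, 'logo.txt')        # stand-in for the logo
    with open(source, 'w') as file:
        file.write('placeholder')

    person_folder = os.path.join(folder, 'people', 'John')
    img_folder = os.path.join(person_folder, 'img')
    os.makedirs(img_folder, exist_ok=True)

    copied = shutil.copy(source, person_folder)      # original stays in place
    moved = shutil.move(copied, img_folder)          # copy leaves person_folder

    source_still_there = os.path.exists(source)
    copy_still_there = os.path.exists(copied)
    moved_exists = os.path.exists(moved)

print(source_still_there, copy_still_there, moved_exists)
```

Copying leaves the original where it was, while moving removes the file from its old location.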


# 6 Listing and looking for files

In [15]:
glob.glob('*')

['files,_folders_&_os_(need).ipynb',
 'my-text-lines.txt',
 'my-text-once.txt',
 'people',
 'sp2273_logo.png',
 'spectrum-01.txt']

```'*'``` is a wildcard that matches anything.

In [16]:
glob.glob('peo*')

['people']

This narrows the search to names that match the pattern ```peo``` followed by anything. Both files and folders can match.

In [17]:
glob.glob('peo*/*')

['people\\John', 'people\\Paul', 'people\\Ringo']

This shows what is inside the folders whose names start with ```peo```.

In [18]:
glob.glob('people/**', recursive=True)

['people\\',
 'people\\John',
 'people\\John\\img',
 'people\\John\\img\\sp2273_logo.png',
 'people\\Paul',
 'people\\Paul\\img',
 'people\\Paul\\img\\sp2273_logo.png',
 'people\\Ringo',
 'people\\Ringo\\img',
 'people\\Ringo\\img\\sp2273_logo.png']

This shows the whole structure of the folder ```people```. With ```recursive=True```, ```**``` stands for all sub-directories, at any depth.

In [19]:
glob.glob('people/**/*.png', recursive=True)

['people\\John\\img\\sp2273_logo.png',
 'people\\Paul\\img\\sp2273_logo.png',
 'people\\Ringo\\img\\sp2273_logo.png']

This goes through the whole structure of ```people``` and returns only the ```.png``` files, i.e. those matching the pattern anything```.png```.
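
A minimal sketch of the recursive search on a made-up folder tree built in a temporary folder (```logo.png``` and ```notes.txt``` are empty placeholder files). It also shows that, without ```recursive=True```, ```**``` behaves like a single ```*```.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Build a small made-up tree: people/<name>/img/logo.png plus a text file
    for person in ['John', 'Paul']:
        img_folder = os.path.join(folder, 'people', person, 'img')
        os.makedirs(img_folder)
        open(os.path.join(img_folder, 'logo.png'), 'w').close()
    open(os.path.join(folder, 'people', 'notes.txt'), 'w').close()

    pattern = os.path.join(folder, 'people', '**', '*.png')
    found = glob.glob(pattern, recursive=True)

    # Without recursive=True, '**' only matches one folder level
    shallow = glob.glob(pattern)

    # Paths relative to the temporary folder, sorted for a stable order
    relative = sorted(os.path.relpath(path, folder) for path in found)

print(relative)
print(shallow)
```

```glob.glob()``` does not promise any particular order, so I sort the results when the order matters.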

# 7 Extracting file info

In [20]:
path = 'people/Ringo/imgs/sp2273_logo.png'
filename = os.path.basename(path)     # unlike split(os.path.sep), handles / on Windows too
extension = filename.split('.')[-1]
print(filename, extension)

sp2273_logo.png png


In [21]:
os.path.split(path)      # Split filename from the rest

('people/Ringo/imgs', 'sp2273_logo.png')

In [22]:
os.path.splitext(path)   # Split extension

('people/Ringo/imgs/sp2273_logo', '.png')

In [23]:
os.path.dirname(path)    # Show the directory

'people/Ringo/imgs'
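
The three functions can be chained to take a path apart completely. A minimal sketch, with variable names of my own (```folder```, ```stem```) and a made-up ```archive.tar.gz``` to show that only the last extension is split off:

```python
import os

path = os.path.join('people', 'Ringo', 'imgs', 'sp2273_logo.png')

folder, filename = os.path.split(path)          # directory and file name
stem, extension = os.path.splitext(filename)    # name and extension

print(folder)
print(filename)
print(stem, extension)

# Only the last extension is split off (made-up file name)
print(os.path.splitext('archive.tar.gz'))
```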

# 8 Deleting stuff

In [25]:
os.remove('people/Ringo/img/sp2273_logo.png')

This removes the file ```sp2273_logo.png``` from the subfolder ```img``` of the folder ```Ringo```.

In [26]:
os.rmdir('people/Ringo')

OSError: [WinError 145] The directory is not empty: 'people/Ringo'

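```os.rmdir()``` only removes empty directories. It fails here because ```people/Ringo``` still contains the subfolder ```img```.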

In [27]:
shutil.rmtree('people/Ringo')

```shutil.rmtree()``` removes a directory together with all the files and folders inside it.
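
Since deletion cannot be undone, here is a minimal sketch that tries both functions on a made-up folder tree inside a temporary folder, so nothing real is at risk. The flag names ```rmdir_worked``` and ```target_exists``` are my own.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, 'people', 'Ringo')
    os.makedirs(os.path.join(target, 'img'))

    try:
        os.rmdir(target)                 # fails: 'img' is still inside
        rmdir_worked = True
    except OSError:
        rmdir_worked = False

    shutil.rmtree(target)                # removes the folder and its contents
    target_exists = os.path.exists(target)

print(rmdir_worked, target_exists)
```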