<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Files, Folders & OS (Need)</span></div>

Need to communicate with the OS to create, modify, move, copy, and delete files and directories (folders). This chapter introduces some Python modules (``os``, ``glob``, ``shutil``) that help carry out these actions. It also shows how to write code that runs on both macOS and Windows without changes.

# Important concepts (Navigating OS)

## Path

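A path is simply a way to specify a location on your computer. You can specify a path **absolutely** (the full location, starting from the root of the drive) or **relatively** (starting from the folder you are currently in).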

In [7]:
path1 = "C:\\Users\\teren\\OneDrive\\Desktop\\learning-portfolio-Terencetjc\\functions\\output.txt"
# Alternatively, use a raw string:
# path1 = r"C:\Users\teren\OneDrive\Desktop\learning-portfolio-Terencetjc\functions\output.txt"
# The r (or R) prefix tells Python to treat backslashes (\) as literal characters
# rather than as the start of an escape sequence, so they are kept in the string as-is.

## More about relative paths

| Notation |       Meaning      |
|:--------:|:------------------:|
|     .    |    ‘this folder’   |
|    ..    | ‘one folder above’ |

``.\data-files\data-01.txt`` means the file data-01.txt in the folder data-files in the **current** folder. <br>
``..\data-files\data-01.txt`` means the file data-01.txt in the folder data-files located in the folder **above**.
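To see how `.` and `..` get resolved, here is a minimal sketch using `os.path.abspath()` and `os.path.normpath()`. The folder names (`data-files`, `a`, `b`, `c`) are illustrative only and do not need to exist on disk.

```python
import os

# Illustrative relative path: go up one folder, then into data-files
relative = os.path.join('..', 'data-files', 'data-01.txt')

# abspath() resolves the relative path against the current working directory
absolute = os.path.abspath(relative)

# normpath() collapses '.' and '..' parts without touching the disk
collapsed = os.path.normpath(os.path.join('a', 'b', '..', 'c'))

print(absolute)
print(collapsed)
```

The printed absolute path depends on where the notebook is running, which is exactly why relative paths are convenient for sharing code.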

### macOS or Linux

In [12]:
# macOS and Linux allow you to use ~ to refer to your home directory.
# So, for example, you can reach the Desktop on these systems with ~/Desktop.

# On Windows, some software accepts ~\Desktop\output.txt, but it is not recognised by all programs or scripts.
import os

# Windows only: the USERPROFILE environment variable holds the home folder (e.g. C:\Users\<name>).
file_path = os.path.join(os.environ['USERPROFILE'], 'Desktop', 'output.txt')
# This builds the path to output.txt on my Desktop without hardcoding my user name. It does not search for the file.
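The `USERPROFILE` variable used above exists only on Windows. A possible cross-platform alternative is `os.path.expanduser('~')`, which expands `~` to the home directory on Windows, macOS and Linux. This is a sketch; `output.txt` on the Desktop is just the example file from above and may not exist.

```python
import os

# '~' is expanded to the current user's home directory on any OS
home = os.path.expanduser('~')

# Build the path to a file on the Desktop without hardcoding the user name
desktop_file = os.path.join(home, 'Desktop', 'output.txt')

print(desktop_file)
```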

## Path separator

Windows uses \ as the path separator while macOS (or Linux) uses /. So, the absolute path to a file on the Desktop on each of these systems will look like this:
| System             | File path                                  |
|--------------------|--------------------------------------------|
|      Windows       | ``C:\Users\chammika\Desktop\data-01.txt``  |
| macOS (or Linux)   | ``/Users/chammika/Desktop/data-01.txt``    |


If I want to share code that works on both systems, I must not **hardcode** either path separator. The ``os`` package solves this problem.

## Text files vs. binary files

Text files are straightforward and universally readable, accessible through various software like Notepad, TextEdit, or Jupyter, and typically come with extensions like ``.txt``, ``.md``, or ``.csv``. <br>

Conversely, binary files necessitate interpretation to extract meaningful data; for instance, viewing the contents of a ``.png`` file in its raw form would yield unintelligible characters. Moreover, some binary files are platform-specific, meaning they can only execute on particular operating systems. For instance, ``Excel.app`` is exclusive to macOS, while ``Excel.exe`` is tailored for Windows, precluding cross-platform compatibility. 

Which to prefer: binary files are usually faster to process and smaller, whereas text files are simpler but can grow considerably in size.
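The difference shows up in Python through the mode passed to `open()`: `'r'` returns text (`str`), while `'rb'` returns the raw bytes. The sketch below writes a tiny throwaway file into a temporary folder (the name `demo.txt` is made up for the example) and reads it back both ways.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')   # hypothetical file name

    with open(path, 'w', encoding='utf-8') as file:
        file.write('0.217\t0.3\n')

    with open(path, 'r', encoding='utf-8') as file:
        as_text = file.read()     # text mode gives a str

    with open(path, 'rb') as file:
        as_bytes = file.read()    # binary mode gives bytes

print(type(as_text), type(as_bytes))
print(as_bytes)
```

On Windows the bytes may contain `\r\n` where the text shows `\n`, because text mode translates line endings.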




## Extensions

Files are usually named to end with an extension separated from the name by a ``.`` like ``name.extension``. This ``extension`` lets the OS know what software or app to use to extract the details in a file. For example, ``.xlsx`` means use Excel and ``.pptx`` means use PowerPoint. Be careful about changing the extension of a file: renaming does not convert the contents, so the OS may try to open the file with the wrong app and fail.

# Opening and closing files

A better way to open a file for reading or writing is the ``with`` statement (a context manager), which closes the file automatically.

## Reading data

In [19]:
with open('spectrum-01.txt', 'r') as file:  # open() opens the file; 'r' means I only want to read from it.
    file_content = file.read()              # Using with frees me from closing the file myself when I am done.

print(file_content)

Light Intensity, Ch A vs Actual Angular Position, Run #4
Actual Angular Position (  )	Light Intensity, Ch A ( % max )
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.1
0.000	-0.1
0.000	-0.2
0.000	-0.3
0.000	-0.2
0.000	-0.2
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.001	-0.1
0.004	-0.1
0.010	-0.2
0.018	-0.2
0.024	-0.3
0.029	-0.3
0.033	-0.3
0.036	-0.2
0.039	-0.1
0.043	-0.1
0.047	-0.1
0.053	-0.1
0.060	-0.1
0.066	-0.1
0.069	-0.1
0.073	-0.1
0.076	-0.1
0.079	-0.1
0.081	-0.1
0.082	-0.1
0.083	-0.2
0.083	-0.2
0.086	-0.2
0.090	-0.2
0.095	-0.2
0.100	-0.3
0.103	-0.3
0.104	-0.2
0.105	-0.3
0.107	-0.2
0.110	-0.2
0.115	-0.1
0.122	-0.2
0.128	-0.1
0.134	-0.2
0.139	-0.1
0.144	-0.2
0.150	-0.2
0.157	-0.2
0.164	-0.2
0.170	-0.3
0.175	-0.3
0.180	-0.2
0.185	-0.2
0.191	-0.1
0.195	-0.1
0.198	-0.2
0.201	-0.1
0.204	-0.2
0.206	-0.2
0.208	-0.3
0.210	-0.3
0.213	-0.1
0.217	0.3
0.222	0.6
0.226	0.2
0.230	0.0
0.233	-0.1
0.235	-0.1
0.237	
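Reading everything with `read()` gives one long string. To get numbers out of it, the file can be read line by line instead. The sketch below assumes the layout seen in the output above (two header lines, then two tab-separated columns) and uses a small stand-in file with three rows copied from that output, not the real `spectrum-01.txt`.

```python
import os
import tempfile

# Stand-in data: two header lines plus three rows copied from the output above
sample = (
    'Light Intensity, Ch A vs Actual Angular Position, Run #4\n'
    'Actual Angular Position (  )\tLight Intensity, Ch A ( % max )\n'
    '0.217\t0.3\n'
    '0.222\t0.6\n'
    '0.226\t0.2\n'
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'spectrum-sample.txt')   # hypothetical file name
    with open(path, 'w') as file:
        file.write(sample)

    positions, intensities = [], []
    with open(path, 'r') as file:
        for line_number, line in enumerate(file):
            if line_number < 2:           # skip the two header lines
                continue
            parts = line.split('\t')
            if len(parts) != 2 or not parts[1].strip():
                continue                  # skip blank or incomplete rows
            positions.append(float(parts[0]))
            intensities.append(float(parts[1]))

print(positions)
print(intensities)
```

The check for incomplete rows is there because a file can end with a half-written line.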

## Writing data

In [22]:
text = 'Far out in the uncharted backwaters of the unfashionable end of the western spiral arm of the Galaxy lies a small unregarded yellow sun.\nOrbiting this at a distance of roughly ninety-two million miles is an utterly insignificant little blue green planet whose ape-descended life forms are so amazingly primitive that they still think digital watches are a pretty neat idea.'
#Learn how to write text into a file. 2 writing methods explained below. 

### Writing to a file in one go

In [26]:
with open('my-text-once.txt', 'w') as file: # Opens the file named 'my-text-once.txt' in write mode ('w').
                                            # If the file does not exist, it is created. If it already exists, its contents are emptied.
                                            # 'as file' gives the opened file the name file, so I can just use file inside the block.
    file.write(text) # Writes the contents of the text variable to the opened file.
# The with statement is used together with open() to ensure that the file is properly closed after use.
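Because `'w'` empties an existing file, adding to a file needs a different mode. A minimal sketch of append mode (`'a'`) follows; it works in a temporary folder and the file name is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'my-text-append.txt')   # hypothetical file name

    with open(path, 'w') as file:    # 'w' starts from an empty file
        file.write('first line\n')

    with open(path, 'a') as file:    # 'a' keeps the content and adds to the end
        file.write('second line\n')

    with open(path, 'r') as file:
        content = file.read()

print(content)
```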

### Writing to a file, line by line

In [27]:
with open('my-text-lines.txt', 'w') as file:
    for line in text.splitlines(): # This iterates over each line of text in the variable text.
                                   # splitlines() splits the text into individual lines and drops the newline characters.
        file.write(line + '\n')    # Write one line at a time, adding the newline back. (writelines() is meant for a list of strings and adds no newlines.)
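`writelines()` is really meant for a whole list of strings, and it does not add newline characters itself. Since `splitlines()` removes them, they have to be put back first. A small sketch with a two-line stand-in for `text`, written to a temporary folder:

```python
import os
import tempfile

text = 'line one\nline two'   # short stand-in for the longer text above

# splitlines() drops the newlines, so add one back to every line
lines = [line + '\n' for line in text.splitlines()]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'my-text-lines.txt')
    with open(path, 'w') as file:
        file.writelines(lines)    # writes every string in the list, in order

    with open(path, 'r') as file:
        written = file.read()

print(written)
```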

# Some useful packages

| Package | Primarily used for                                                                |
|---------|-----------------------------------------------------------------------------------|
| os      | To ‘talk’ to the OS to create, modify, delete folders and write OS-agnostic code. |
| glob    | To search for files.                                                              |
| shutil  | To copy files.                                                                    |

While both ``os`` and ``shutil`` offer file and directory operations, ``shutil`` provides higher-level, more convenient functions.<br>
``shutil`` is particularly useful for copying files (``shutil.copy()``), moving files or folders (``shutil.move()``), and deleting a whole folder tree (``shutil.rmtree()``). ``os`` has no direct equivalent for copying or for deleting a non-empty folder, although it can rename (``os.rename()``), delete single files (``os.remove()``) and delete empty folders (``os.rmdir()``).

In [1]:
import os
import glob
import shutil

# OS safe paths

Consider a file data-01.txt in the sub-directory sg-data of the directory all-data.

all-data --> sg-data --> data-01.txt

If I want to access data-01.txt all I have to do is:

In [30]:
path = os.path.join('.', 'all-data', 'sg-data', 'data-01.txt')
print(path) #If not on windows, output will be './all-data/sg-data/data-01.txt'

.\all-data\sg-data\data-01.txt


Using **os.path.join()** adjusts the path with either / or \ as necessary, so the same code runs on any OS without changes.
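To see both results on one machine, the standard library also exposes the two flavours of `os.path` directly as `ntpath` (Windows) and `posixpath` (macOS/Linux). This is only a demonstration; in normal code `os.path` picks the right one automatically.

```python
import ntpath      # the Windows flavour of os.path
import posixpath   # the macOS/Linux flavour of os.path

parts = ('.', 'all-data', 'sg-data', 'data-01.txt')

windows_path = ntpath.join(*parts)
posix_path = posixpath.join(*parts)

print(windows_path)   # .\all-data\sg-data\data-01.txt
print(posix_path)     # ./all-data/sg-data/data-01.txt
```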

# Folders

## Creating folders

Folders can be created programmatically using ``os.mkdir()``. This is useful for organising data quickly. <br>

Example: If I need to store information about "John", "Paul", "Ringo":

In [3]:
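shutil.rmtree('people', ignore_errors=True) # Deletes any existing 'people' folder first so it can be recreated; ignore_errors=True avoids an error if it does not exist yet. (I realised this sidesteps the problem that try-except handles in the next part.)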
os.mkdir('people')

for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    print(f'Creating {path}')
    os.mkdir(path)

Creating people\John
Creating people\Paul
Creating people\Ringo


## Checking for file existence

### Using try-except

In [46]:
for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    try:
        os.mkdir(path)
        print(f'Creating {path}')
    except FileExistsError:
        print(f'{path} already exists; skipping creation.')

people\John already exists; skipping creation.
people\Paul already exists; skipping creation.
Creating people\Ringo


### Using os.path.exists()

In [5]:
for person in ['John', 'Paul', 'Ringo']:
    path = os.path.join('people', person)
    if os.path.exists(path):
        print(f'{path} already exists; skipping creation.')
    else:
        os.mkdir(path)
        print(f'Creating {path}')

people\John already exists; skipping creation.
people\Paul already exists; skipping creation.
people\Ringo already exists; skipping creation.
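A third option, not used above, is `os.makedirs()` with `exist_ok=True`. It creates any missing parent folders too and does nothing if the folder is already there. The sketch runs inside a temporary folder so it does not touch the real `people` directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    for person in ['John', 'Paul', 'Ringo']:
        path = os.path.join(base, 'people', person)
        os.makedirs(path, exist_ok=True)   # also creates 'people' if it is missing
        os.makedirs(path, exist_ok=True)   # calling it again raises no error

    created = sorted(os.listdir(os.path.join(base, 'people')))

print(created)
```

The trade-off is that nothing is printed or raised when a folder already exists, so try-except or `os.path.exists()` is still the better fit when I need to know.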


## Copying files

In [27]:
for person in ['John', 'Paul', 'Ringo']:
    path_to_destination = os.path.join('people', person)
    shutil.copy('sp2273_logo.png', path_to_destination)
    print(f'Copied file to {path_to_destination}')

Copied file to people\John
Copied file to people\Paul
Copied file to people\Ringo


In [28]:
# I want all the images in a sub-folder called imgs in each person’s directory. 
# I can do this by first creating the folders imgs and then moving the logo file into that folder.
for person in ['John', 'Paul', 'Ringo']:
    # Create folder 'imgs'
    path_to_imgs = os.path.join('people', person, 'imgs')
    if not os.path.exists(path_to_imgs):
        os.mkdir(path_to_imgs)

    # Move logo file
    current_path_of_logo = os.path.join('people', person, 'sp2273_logo.png')
    new_path_of_logo = os.path.join('people', person, 'imgs', 'sp2273_logo.png')

    shutil.move(current_path_of_logo, new_path_of_logo)
    print(f'Moved logo to {new_path_of_logo}')

Moved logo to people\John\imgs\sp2273_logo.png
Moved logo to people\Paul\imgs\sp2273_logo.png
Moved logo to people\Ringo\imgs\sp2273_logo.png
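The copy and the move can also be combined by copying straight into the `imgs` sub-folder. The sketch below assumes that approach and uses a temporary folder with a dummy file called `logo.png` standing in for the real logo. `shutil.copy()` returns the path of the new copy, which is handy for reporting.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Dummy stand-in for the logo file (not a real image)
    logo = os.path.join(base, 'logo.png')
    with open(logo, 'wb') as file:
        file.write(b'not a real image')

    copied = []
    for person in ['John', 'Paul', 'Ringo']:
        imgs = os.path.join(base, 'people', person, 'imgs')
        os.makedirs(imgs, exist_ok=True)          # create people/<person>/imgs
        destination = shutil.copy(logo, imgs)     # copy directly into imgs
        copied.append(os.path.relpath(destination, base))

print(copied)
```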


# Listing and looking for files (glob)

In [30]:
#I use this if I want all the files in the current directory.
#The * is called a wildcard and is read as ‘anything’. So, I am asking glob to give me anything in the folder.
glob.glob('*')

['-p',
 'files,_folders_&_os_(need).ipynb',
 'my-text-lines.txt',
 'my-text-once.txt',
 'people',
 'sp2273_logo.png',
 'spectrum-01.txt']

In [31]:
#If I want to refine my search and ask glob to give only those files that match the pattern ‘peo’ followed by ‘anything’.
glob.glob('peo*')

['people']

In [32]:
#If I want to know what is inside the folders that start with peo.
glob.glob('peo*/*')

['people\\John', 'people\\Paul', 'people\\Ringo']

In [33]:
#If I want to see the whole, detailed structure of the folder people.
#For this, I need to tell glob to search recursively (i.e. dig through all sub-directories) by putting recursive=True.
#I must also use two wildcards ** to say 'all sub-directories'.
glob.glob('people/**', recursive=True)

['people\\',
 'people\\John',
 'people\\John\\imgs',
 'people\\John\\imgs\\sp2273_logo.png',
 'people\\Paul',
 'people\\Paul\\imgs',
 'people\\Paul\\imgs\\sp2273_logo.png',
 'people\\Ringo',
 'people\\Ringo\\imgs',
 'people\\Ringo\\imgs\\sp2273_logo.png']

In [34]:
#If I want only the .png files. I am asking glob to go through the whole structure of people and show me those files with the pattern ‘anything’.png.
glob.glob('people/**/*.png', recursive=True)

['people\\John\\imgs\\sp2273_logo.png',
 'people\\Paul\\imgs\\sp2273_logo.png',
 'people\\Ringo\\imgs\\sp2273_logo.png']
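The patterns above hardcode `/`. A sketch of building the pattern with `os.path.join()` instead, and sorting the result so the order is predictable, is given below. It builds a small made-up folder tree in a temporary folder rather than using the real `people` directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build a small made-up tree: people/<person>/imgs/{logo.png, notes.txt}
    for person in ['John', 'Paul']:
        imgs = os.path.join(base, 'people', person, 'imgs')
        os.makedirs(imgs)
        for name in ['logo.png', 'notes.txt']:
            with open(os.path.join(imgs, name), 'w') as file:
                file.write('placeholder')

    # Same idea as 'people/**/*.png', but with the OS's own separator
    pattern = os.path.join(base, 'people', '**', '*.png')
    matches = sorted(glob.glob(pattern, recursive=True))

    # Show the matches relative to the temporary folder
    names = [os.path.relpath(match, base) for match in matches]

print(names)
```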

# Extracting file info

In [36]:
#When dealing with files and folders, you often have to extract the filename, folder or extension. 
#You can do this by simple string manipulation; for example if I want the filename and extension:
path = os.path.join('people', 'Ringo', 'imgs', 'sp2273_logo.png')  # built with the OS separator; a hardcoded '/' path is not split on Windows
filename = path.split(os.path.sep)[-1]
extension = filename.split('.')[-1]
print(filename, extension)

sp2273_logo.png png


``os.path.sep`` is the path separator (i.e. ``\`` or ``/``) for the OS. I split the path where the separator occurred and picked the last element in the list. I use a similar strategy for the file extension.

However, if you like, ``os`` provides some simple functions for these tasks.

In [38]:
path = 'people/Ringo/imgs/sp2273_logo.png'

In [39]:
os.path.split(path)      # Split filename from the rest

('people/Ringo/imgs', 'sp2273_logo.png')

In [40]:
os.path.splitext(path)   # Split extension

('people/Ringo/imgs/sp2273_logo', '.png')

In [41]:
os.path.dirname(path)    # Show the directory

'people/Ringo/imgs'
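These functions can be combined to pull out just the file name, its stem and its extension. A short sketch using `os.path.basename()` together with `os.path.splitext()`:

```python
import os

path = os.path.join('people', 'Ringo', 'imgs', 'sp2273_logo.png')

filename = os.path.basename(path)               # last part of the path
stem, extension = os.path.splitext(filename)    # split off the extension

print(filename)    # sp2273_logo.png
print(stem)        # sp2273_logo
print(extension)   # .png
```

Note that `splitext()` keeps the dot in the extension, unlike the string-splitting approach.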

# Deleting stuff

In [42]:
#If you want to remove a file:
os.remove('people/Ringo/imgs/sp2273_logo.png')

In [None]:
#This won’t work with directories. For an empty directory, use:
os.rmdir('people/Ringo')

In [47]:
#For a directory with files, use shutil:
shutil.rmtree('people/Ringo')

#Be careful when using these functions.
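One way to be careful is to check what a path points to before deleting it. The helper `delete_path()` below is a hypothetical name made up for this sketch, not part of `os` or `shutil`. It is tried out in a temporary folder so nothing real gets deleted.

```python
import os
import shutil
import tempfile


def delete_path(path):
    """Delete a file or a whole folder tree; return False if nothing was there."""
    if os.path.isfile(path):
        os.remove(path)          # single file
    elif os.path.isdir(path):
        shutil.rmtree(path)      # folder and everything inside it
    else:
        return False
    return True


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, 'people', 'Ringo', 'imgs'))
    target = os.path.join(base, 'people')

    first = delete_path(target)    # the folder exists, so it is deleted
    second = delete_path(target)   # nothing left to delete

print(first, second)
```

Deletions made this way do not go to the recycle bin, so testing on a throwaway folder first is a sensible habit.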