<div style="text-align:left;font-size:2em"><span style="font-weight:bolder;font-size:1.25em">SP2273 | Learning Portfolio</span><br><br><span style="font-weight:bold;color:darkred">Files, Folders & OS (Need)</span></div>

# What to expect in this chapter
1. Learning how to use the following Python modules: `os`, `glob`, and `shutil`
2. These modules allow us to communicate with the OS to create, modify, move, copy and delete files and directories (folders)


# 1 Important concepts

## 1.1 Path
1. Essentially refers to the address/location of a particular file inside the computer. 
2. There are two main types of paths: 
    1. Absolute - gives the direct and specific location of a particular file:

    `"C:\Users\SHRINJANA\Documents\GitHub\learning-portfolio-shrin0811\using jupyter\postrendering.png"`

    The above is the absolute path of the file `postrendering.png`

    2. Relative - Refers to the location of a file _with respect_ to another location, usually the folder we are currently working in. 

## 1.2 More about relative paths

Important notation to keep in mind when talking about relative paths: 

|Notation|Meaning|Use|Interpretation|
|:--:|:--:|:--:|:--:|
|`.`|'this folder'|`.\foldername1\filename2.extension`|The file `filename2.extension` is in `foldername1`, which is inside the current folder|
|`..`|'one folder above'|`..\foldername1\filename2.extension`|The file `filename2.extension` is in `foldername1`, which is inside the folder one level above the current folder|
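
A relative path is resolved against the current working directory. A minimal sketch of how `.` and `..` turn into absolute paths (the printed values depend on where the code is run):

```python
import os

# A relative path is interpreted starting from the current working directory.
cwd = os.getcwd()

# os.path.abspath() converts a relative path into an absolute one.
here = os.path.abspath('.')      # '.'  -> this folder
parent = os.path.abspath('..')   # '..' -> one folder above

print(here)
print(parent)
print(os.path.isabs('.'), os.path.isabs(here))
```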



### macOS or Linux

Allows users to use `~` to refer to the user's home directory - therefore the `Desktop` folder can be accessed using the following notation: `~/Desktop`
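
Inside Python, `~` is not expanded automatically by `open()`, so it has to be expanded explicitly. A small sketch using `os.path.expanduser()`; the `Desktop` folder is only an example and may not exist on every machine:

```python
import os

# Replace '~' with the home directory of the current user.
home = os.path.expanduser('~')

# Build the path to the Desktop folder without typing a separator.
desktop = os.path.join(home, 'Desktop')   # assumption: may not exist on every machine
print(desktop)
```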

## 1.3 Path separator

This is the primary difference between paths on Linux/macOS and on Windows. If a collaboration is happening between both types of systems, it is necessary to ensure that the separators are not **hardcoded**. This problem is fixed using the Python `os` module. 

Path separator for Windows: `\` - backslash

Path separator for macOS or Linux: `/` - forward slash

## 1.4 Text files vs. Binary files

1. Text files:
    1. Easy to understand and readable
    2. Can be opened and edited with any text editor on any OS
    3. Some example extensions: `.txt`, `.rtf`, `.md`, `.csv`

2. Binary files:
    1. Usually stored in a machine-oriented format - and hence require some processing to make sense of what they contain - e.g. if we open a `.png` file and try to read its raw data as text, we will see gibberish. 
    2. Some binary files will only run on specific OSs - e.g. the `.app` extension will work on macOS and not on Windows, while the reverse applies for the `.exe` extension. 
    3. They are better than text files when size and speed are taken into consideration. 
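
The difference shows up in how a file is opened: `'r'`/`'w'` give text (`str`), while `'rb'`/`'wb'` give raw `bytes`. A minimal sketch, using hypothetical file names inside a temporary folder so that nothing in the project is touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, 'note.txt')     # hypothetical file names
    binary_path = os.path.join(tmp, 'data.bin')

    with open(text_path, 'w', encoding='utf-8') as file:
        file.write('hello')
    with open(binary_path, 'wb') as file:
        file.write(bytes([137, 80, 78, 71]))      # the first 4 bytes of a PNG file

    with open(text_path, 'r', encoding='utf-8') as file:
        as_text = file.read()       # text mode returns a str
    with open(binary_path, 'rb') as file:
        as_bytes = file.read()      # binary mode returns a bytes object

print(type(as_text), type(as_bytes))
```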

## 1.5 Extensions

1. Standard format for naming files: `filename.extension`
2. The part after the `.` essentially indicates to the OS (and to the user) what kind of file it is, and hence which software/app is best suited to open it. 

# 2 Opening and closing files
_learning how to use the `with` statement (called a 'context manager')_

## 2.1 Reading data

Here is the standard code chunk to open and read a file (ensure that the file is in the same directory/folder as the code file, otherwise the full path has to be given). 

```python
with open('name.txt', 'r') as file:
    file_content = file.read()
print(file_content)
```

|Function/code statement|Meaning/purpose|
|:--:|:--:|
|`open()`|opens the file and gives the interpreter a file object to work with|
|`r`|specifies that the code only wants to read the file, and not edit it|
|`with`|makes sure that the file is closed once the indented block ends, even if an error occurs|
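
To see that `with` really closes the file, we can check the `closed` attribute of the file object. A small sketch; the file is first created in a temporary folder (an assumption made only so that the example runs anywhere):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'name.txt')
    with open(path, 'w') as file:
        file.write('first line\nsecond line\n')

    with open(path, 'r') as file:
        file_content = file.read()
        open_inside = not file.closed    # the file is still open inside the block

    closed_after = file.closed           # the file is closed once the block ends

print(file_content)
print(open_inside, closed_after)
```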

## 2.2 Writing data

Let this be the text we wish to edit: 

`abc`= _"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Id nibh tortor id aliquet lectus proin nibh. At augue eget arcu dictum varius duis at consectetur lorem. Rhoncus urna neque viverra justo nec ultrices dui. Vestibulum mattis ullamcorper velit sed ullamcorper. Mi in nulla posuere sollicitudin aliquam ultrices sagittis orci."_

### Writing to a file in one go

Generic syntax for this: 
```python
with open('filename.txt', 'w') as file:
    file.write(abc)
```

Remarks: 
1. The `w` indicates to the interpreter that the file has to be opened for writing. 
2. If the file `filename.txt` did not exist in the working directory before this, it will be created, and then the text saved as `abc` will be written to it. If the file already exists, its previous contents are overwritten. 
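
Opening an existing file with `'w'` replaces whatever it contained. To add to the end instead, Python offers the `'a'` (append) mode, which is not covered in the notes above. A minimal sketch in a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'filename.txt')

    with open(path, 'w') as file:
        file.write('first')
    with open(path, 'w') as file:
        file.write('second')        # 'w' again: 'first' is replaced
    with open(path, 'a') as file:
        file.write(' third')        # 'a': added after the existing text

    with open(path, 'r') as file:
        result = file.read()

print(result)
```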

### Writing to a file, line by line

_note: this process is generally slower than writing the text in one go - the iterative loop slows the process down_ 

Generic syntax:
```python
with open('filename.txt', 'w') as file:
    for line in abc.splitlines():
        # splitlines() drops the line breaks, so add one back after each line
        file.write(line + '\n')
```
|Syntax|Meaning/Purpose|
|:--:|:--:|
|`abc.splitlines()`|breaks the text stored in `abc` into a list of separate lines (without their line breaks)|
|`write(line + '\n')`|writes each line separately, followed by a line break; `writelines()` is meant for a whole list of strings and does not add line breaks by itself|
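
`writelines()` expects a collection of strings and writes them one after another without adding any line breaks. A small sketch with a hypothetical three-line text, where the line breaks are added by hand:

```python
import os
import tempfile

text = 'line one\nline two\nline three'   # hypothetical text with line breaks
lines = text.splitlines()                 # ['line one', 'line two', 'line three']

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'filename.txt')

    with open(path, 'w') as file:
        # writelines() adds no line breaks, so each line gets its own '\n'
        file.writelines(line + '\n' for line in lines)

    with open(path, 'r') as file:
        read_back = file.read()

print(read_back)
```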


# 3 Some useful packages

|Package|Primary use|
|:--:|:--|
|`os`|To help developers write code that is OS-blind (i.e. is not OS-specific), and to create, modify and delete folders across operating systems - thus helping in collaborative code writing|
|`glob`|To search for files in the directory|
|`shutil`|To copy and move files, and to delete folders together with their contents|

# 4 OS safe paths

_ensure that the `os` module has been imported beforehand_

The function `os.path.join('.', 'main-directory', 'sub-directory', 'filename.extension')` builds the path using the separator of the OS that the code is running on, so the separator never has to be hardcoded, i.e. if the path was printed: 
1. For Windows: `.\main-directory\sub-directory\filename.extension`
2. For macOS/Linux: `./main-directory/sub-directory/filename.extension`
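
A quick way to check this on any machine is to compare the joined path with the same parts glued together by `os.sep` (the separator of the current OS):

```python
import os

parts = ['.', 'main-directory', 'sub-directory', 'filename.extension']

# os.path.join() inserts the separator of the OS the code is running on.
path = os.path.join(*parts)

# The same result, built by hand from os.sep.
expected = os.sep.join(parts)

print(path)
print(repr(os.sep))
```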

# 5 Folders
_learning to use the function `os.mkdir()`_ 

## 5.1 Creating folders

This is a vital skill to know if we are to learn how to organize files in main/sub-directories. 

Let `files` be a list of 'n' names, and let `subdirec` be the folder in which one sub-folder per name is supposed to be created. 

Algorithm:
1. Create the folder `subdirec` using the `os.mkdir('subdirec')` function
2. Traverse the list of names 
3. Create the path using the `os.path.join(params)` function
4. Create the sub-folder at the specified path using the `os.mkdir(specified_path)` function 

Syntax:
```python
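import os

os.mkdir('subdirec')                        # create the main folder first
for name in files:                          # `files` holds the names defined earlier
    path = os.path.join('subdirec', name)
    os.mkdir(path)                          # create one sub-folder per name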
```

## 5.2 Checking for existence

`os.mkdir()` does not allow the developer to create a folder that already exists at the same path/location - it raises a `FileExistsError`. 

### Using try-except

Generic syntax to catch the error thrown and handle it: 

(assume that the required modules have been imported and that the directories have been created)

```python
for name in files:
    path=os.path.join('subdirec', name)
    try:
        os.mkdir(path)
    except FileExistsError:
        print("Skip creation - already exists.")
```

### Using os.path.exists()
_the good thing about the `os` package is that it usually has in-built functions to deal with common issues that may arise with file-handling_ 

Generic syntax to implement this very useful function: 

_assume that required packages have been imported and the necessary directories have been created_

```python
for name in files: 
    path=os.path.join('subdirec', name)
    if os.path.exists(path):
        print("Skipping creation as folder already exists.")
    else:
        os.mkdir(path)
```
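
A third option, which is not part of the notes above, is `os.makedirs()` with `exist_ok=True`: it creates any missing parent folders and does nothing if the folder is already there. A minimal sketch with hypothetical names in a temporary folder:

```python
import os
import tempfile

files = ['folder_a', 'folder_b']    # hypothetical names

with tempfile.TemporaryDirectory() as tmp:
    # Run the loop twice: the second pass finds the folders and skips them.
    for attempt in range(2):
        for name in files:
            path = os.path.join(tmp, 'subdirec', name)
            # Also creates 'subdirec' itself if it is missing.
            os.makedirs(path, exist_ok=True)

    created = sorted(os.listdir(os.path.join(tmp, 'subdirec')))

print(created)
```

Note that `exist_ok=True` only helps when the existing path is a folder; if a regular file has the same name, an error is still raised.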


## 5.3 Copying files

Type 1: Copying directly to a set of folders

Algorithm: 
1. Define the path that you want to copy your required file to. 
2. Use the `shutil.copy('filename.extension', path)` function to copy your required file from the existing directory to the required directory. 
3. Traverse your necessary list of files in the same way if your required file has to be copied to more than one destination. 

Generic syntax:
_Assume that your file is: `abc.txt`. The required packages have been imported, and we are using the same files and directories as before._

```python
for name in files:
    destination=os.path.join('subdirec', name)
    shutil.copy('abc.txt', destination)
```
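
The same idea as a self-contained sketch that can be run anywhere. The folder names are hypothetical and everything happens inside a temporary folder. When the destination is a folder, `shutil.copy()` keeps the original file name and returns the path of the new copy:

```python
import os
import shutil
import tempfile

files = ['folder_a', 'folder_b']    # hypothetical destination folders

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, 'abc.txt')
    with open(source, 'w') as file:
        file.write('some text')

    copied_to = []
    for name in files:
        destination = os.path.join(tmp, 'subdirec', name)
        os.makedirs(destination, exist_ok=True)
        copied_to.append(shutil.copy(source, destination))

    all_exist = all(os.path.isfile(p) for p in copied_to)
    source_still_there = os.path.isfile(source)   # copying leaves the original in place

print(all_exist, source_still_there)
```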

Type 2: Arranging your folder by creating a new sub-folder and then moving the file into it

Algorithm: 
1. Create the new sub-folder
    1. Check if the folder exists or not. 
2. Access the current path of your file
3. Define the path for the new destination of your file. 
4. Use the function `shutil.move(current_path, new_path)` to move and organize the file in the newly created sub-folder within the same directory. 
5. Repeat this for each file in your list, if this is necessary for more than one. 

Generic syntax: 

_assume that the new sub-folder will be called `texts`_

```python
for name in files: 
    sub_fol=os.path.join('subdirec', name, 'texts')
    if not os.path.exists(sub_fol):
        os.mkdir(sub_fol)
    
    current=os.path.join('subdirec', name, 'abc.txt')
    new=os.path.join('subdirec', name, 'texts', 'abc.txt')
    shutil.move(current, new)
```
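
Unlike copying, moving removes the file from its old location. A small runnable sketch for a single hypothetical folder (`folder_a`) inside a temporary folder:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, 'folder_a')     # hypothetical folder name
    sub_fol = os.path.join(folder, 'texts')
    os.makedirs(sub_fol)

    current = os.path.join(folder, 'abc.txt')
    with open(current, 'w') as file:
        file.write('some text')

    new = os.path.join(sub_fol, 'abc.txt')
    shutil.move(current, new)

    moved = os.path.exists(new)                   # the file is now in 'texts'
    original_gone = not os.path.exists(current)   # and no longer in the old place

print(moved, original_gone)
```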


# 6 Listing and looking for files

_essentially learning the uses of the `glob` package_

The different statements are explained in this table below:

|Statements|Meaning|
|:--:|:--|
|`glob.glob('*')`|Returns all the files and folders in the current working directory|
|`glob.glob('peo*')`|Returns all the files in the current working directory that begin with 'peo' followed by any set of characters|
|`glob.glob('peo*/*')`|Returns all the files/folders inside the folders that begin with 'peo'|
|`glob.glob('people/**', recursive=True)`|Returns the entire detailed structure of the folder 'people' - the interpreter must go through the folder recursively to ensure that it reaches each sub-folder and file inside it; hence we use `**` to indicate every level of sub-directory, and `recursive=True` to ensure complete parsing|
|`glob.glob('people/**/*.png', recursive=True)`|Returns only those files with a `.png` extension after parsing recursively through the entire structure of 'people'| 
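
A runnable sketch of the recursive search. The folder structure (`people/person_a`, `people/person_b`) and the file names are hypothetical and are created in a temporary folder first:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small hypothetical structure to search through.
    for person in ['person_a', 'person_b']:
        folder = os.path.join(tmp, 'people', person)
        os.makedirs(folder)
        for file_name in ['photo.png', 'notes.txt']:
            open(os.path.join(folder, file_name), 'w').close()

    # '**' together with recursive=True searches every level below 'people'.
    pattern = os.path.join(tmp, 'people', '**', '*.png')
    png_files = sorted(glob.glob(pattern, recursive=True))

    # Show the results relative to the temporary folder.
    png_names = [os.path.relpath(p, tmp) for p in png_files]

print(png_names)
```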

# 7 Extracting file info

This can be done using two methods: 
1. String manipulation of the file name
    1. In this case we use the attribute `os.path.sep` - which holds the path separator of the current OS, and so allows the interpreter to split the path into its different sub-directories. 
    2. Here is the generic syntax used:
    
        ```python
        import os

        path=os.path.join('main_directory', 'sub_directory', 'sub-sub_directory', 'filename.extension')
        #building the path with os.path.join ensures that it contains the same separator that os.path.sep refers to

        fil_name=path.split(os.path.sep)[-1] 
        #the split function separates the path string into tokens at each path separator; the last token is the file name

        fil_extension=fil_name.split('.')[-1] 
        #the split function is used again to separate the extension from the rest of the file name, as the '.' is not detected by os.path.sep
        ```


2. Using in-built `os` functions: 
    1. Let the path for the following explanations be as follows: 
    ```python
    path='main_directory/sub_directory/sub-sub_directory/filename.extension'
    ```
    2. The explanation of the different functions is as follows: 

    |Function|Meaning|
    |:--:|:--|
    |`os.path.split(path)`|Splits the path into a pair: the sequence of folders, and the file name|
    |`os.path.splitext(path)`|Splits the path into a pair: everything before the extension, and the extension (including the `.`)|
    |`os.path.dirname(path)`|Returns only the sequence of folders/directories, without the file name|
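
A short sketch of what these functions return for the example path (built here with `os.path.join` so that it works on any OS):

```python
import os

path = os.path.join('main_directory', 'sub_directory', 'sub-sub_directory', 'filename.extension')

# split() returns a pair: (folders, file name)
folders, file_name = os.path.split(path)

# splitext() returns a pair: (everything before the extension, extension)
root, extension = os.path.splitext(path)

print(folders)
print(file_name)
print(extension)
print(os.path.dirname(path) == folders)
```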


# 8 Deleting stuff

_be careful!_

|Functions/Commands|Purpose|
|:--:|:--|
|`os.remove('entire_path_of_the_file_to_be_deleted')`|Only to remove a specific file|
|`os.rmdir('path of empty directory')`|Only to be used for empty directories|
|`shutil.rmtree('path of non-empty directory')`|To be used to remove folders with files (exercise caution!)|
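
A safe way to try these out is inside a temporary folder, so that nothing important can be deleted by accident. A minimal sketch with hypothetical file and folder names:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Set up one file, one empty folder and one folder with a file in it.
    file_path = os.path.join(tmp, 'old.txt')
    open(file_path, 'w').close()
    empty_dir = os.path.join(tmp, 'empty')
    os.mkdir(empty_dir)
    full_dir = os.path.join(tmp, 'full')
    os.mkdir(full_dir)
    open(os.path.join(full_dir, 'inside.txt'), 'w').close()

    os.remove(file_path)     # removes a single file
    os.rmdir(empty_dir)      # removes an empty folder

    try:
        os.rmdir(full_dir)   # fails: the folder is not empty
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    shutil.rmtree(full_dir)  # removes the folder and everything inside it
    remaining = os.listdir(tmp)

print(rmdir_failed, remaining)
```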