# Lesson 03: File paths and looping

# 1. Navigate your computer's file system with `pathlib`

Using the `pathlib` module from the Python standard (built-in) library, you can easily navigate and manipulate files on your computer.

```python
import pathlib # Loads the pathlib module into memory
```

The primary object within the pathlib module is `Path`. You create a new `Path` object by calling it like a function.

```python
my_path = pathlib.Path() # Relative path to cwd (wherever your current notebook is)
```

This creates a _relative_ `Path` object that represents your current working directory. No matter where you are on your computer, this relative path is represented by `.`. If you were to navigate one directory down, your path would become `./new_directory/`.

If you would like to make your path object an _absolute_ path, you do so with the `.resolve()` method:

```python
my_path_abs = my_path.resolve()
my_path_abs
```
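To make the difference between the two kinds of path concrete, here is a minimal sketch. The variable names `relative_path` and `absolute_path` are only illustrative; `.is_absolute()` is a standard `Path` method.

```python
import pathlib

relative_path = pathlib.Path()           # Relative path to the cwd
absolute_path = relative_path.resolve()  # Same location, written out in full

print(relative_path)                # Prints: .
print(relative_path.is_absolute())  # False
print(absolute_path.is_absolute())  # True
```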

## Things you can do with a `Path` object

### Navigate to a new directory

```python
my_new_path = my_path_abs / "New_directory_name" # Use the divide operator, neat huh?
```

### Make a new directory

```python
my_new_path = my_path_abs / "New_directory_name" # This directory does not exist yet
my_new_path.mkdir() # Now it does
```
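Calling `.mkdir()` on a directory that already exists raises a `FileExistsError`, which is easy to trigger by re-running a notebook cell. The sketch below shows one way around that, using the `parents` and `exist_ok` arguments of `.mkdir()`. The directory names are hypothetical, and the work happens in a temporary directory so nothing real is touched.

```python
import pathlib
import tempfile

# Work inside a throwaway temporary directory
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    nested = base / "Project" / "Calcs"  # Hypothetical directory names

    nested.mkdir(parents=True)                 # Also creates the missing "Project" parent
    nested.mkdir(parents=True, exist_ok=True)  # Running it again raises no error
    created = nested.exists()

print(created)  # True
```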

### Make a path to a file

```python
# This file path does not exist yet but that's ok
my_file = my_path_abs / "New_directory_name" / "My_excel_file.xlsx"
```

### Check to see if a path exists
```python
my_new_path.exists()
my_file.exists()
```

### Create a new empty file
```python
my_file.touch()
my_file.exists()
```

### Rename a file
```python
my_file.rename(my_file.parent / "new_file_name.txt") # Use absolute path, can be used to move files too
```

### And more...

In **Edit mode**, type `my_new_path.` then hit `[TAB]` and wait for Jupyter to show you the list of all the other methods and attributes of your path!
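As a small taste of what that list contains, here is a sketch of a few attributes that are often useful. The path below does not need to exist for these to work.

```python
import pathlib

my_file = pathlib.Path("New_directory_name") / "My_excel_file.xlsx"

print(my_file.name)    # My_excel_file.xlsx
print(my_file.stem)    # My_excel_file
print(my_file.suffix)  # .xlsx
print(my_file.parent)  # New_directory_name
print(my_file.with_suffix(".csv").name)  # My_excel_file.csv
```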

We will come back to paths at the end of this lesson...

# 2. Automating the work: `for` loops

Part of the reason we are doing this Python stuff is to have the computer do our repetitive work for us. We accomplish this by putting a task into a loop and instructing the computer to repeat the loop until the work is done.

In Python, the primary tool for this is the `for` loop.

## Looping Syntax

```python
for <item_description> in <collection_of_items>:
    <the code>
    <for our task(s)>
    <goes here...>
```

A quick example:

```python
animals = ["bear", "coyote", "otter", "marmot", "crow", "whale"]

crossword_clues = [] # An empty list
for animal in animals:
    crossword_clue = f"{animal.upper()} - {len(animal)} letters"
    crossword_clues.append(crossword_clue)

print(crossword_clues)

```

## Understanding the loop syntax

```python
for <item> in <collection_of_items>:
```

In order to loop, we need to have something that we can loop over. That is, we need an _iterable_ of some kind. Many things in Python are _iterable_ including lists and strings (remember, a string is a collection of characters).

While we are in the loop, the loop visits each item in the collection sequentially. To refer to the current item, we give it a name. That name can be anything we want, but it's best if the name relates to the name of the iterable.

In other words:

```python
for <name_given_to_the_current_item> in <my_collection_of_items>:
```

Here are some examples:

```python
multiples_3 = [3, 6, 9, 12]
for multiple in multiples_3:
    print(multiple)
```

But this also works (although what's happening is not as apparent):

```python
turkeys = [5, 6, 7, 8]
for book in turkeys: # < - Books? In turkeys?
    print(book)
```
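Since a string is also an iterable, the same syntax works on it one character at a time. A minimal sketch, with `word` and `letters` as illustrative names:

```python
word = "otter"

letters = []  # Accumulator for the upper-case letters
for letter in word:
    letters.append(letter.upper())

print(letters)  # ['O', 'T', 'T', 'E', 'R']
```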

## "Pythonic" looping vs. "Non-pythonic" looping

If you have experience with MATLAB, C, or C++, then you will be used to seeing a different kind of loop:

```python
multiples_3 = [3, 6, 9, 12]

for idx in range(len(multiples_3)):
    print(multiples_3[idx])
```

In this kind of loop, you are not given the actual items in the collection. You are _generating_ integers in a sequence and using those integers to retrieve the item in the collection using indexing.

If you do this in Python, people will accuse you of being "non-pythonic". Sure, you can do it and it will work just fine. However, you are not taking advantage of the fact that Python gives you each item directly, without any indexing.

**But!!! What if you need the index for some reason???**

Ah! There is a pythonic solution for this: `enumerate`.
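As a quick sketch of how that looks, the built-in `enumerate` hands you both the index and the item on each pass through the loop. The list `numbered` is just an illustrative accumulator.

```python
multiples_3 = [3, 6, 9, 12]

numbered = []  # Accumulator
for idx, multiple in enumerate(multiples_3):
    numbered.append(f"{idx}: {multiple}")

print(numbered)  # ['0: 3', '1: 6', '2: 9', '3: 12']
```

The index starts at 0, just like regular list indexing.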

# Looping Recipes

## Transforming items in a list (or other collection)

A common application for looping is **transforming** data in a `list` or other kind of collection and putting the transformed data into a new `list`.

This is the general recipe for that:

```python
your_data = ['data1', 'data2', 'data3', ...]

acc = [] # Your "accumulator", an empty list
for <item> in <your_data>:
    new_item = <do something with item>
    acc.append(new_item)
```

And here is an example of that:

```python
column_dimensions = [[300, 600], [200, 600], [400, 400]] # l x w

etabs_labels = [] # The accumulator
for column_dim in column_dimensions: # Each item is a list of two numbers
    length = column_dim[0]
    width = column_dim[1]
    etabs_label = f"COL{length}X{width}"
    etabs_labels.append(etabs_label)
```
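When every item is itself a short list of known length, one possible shortcut is to unpack it right in the `for` line instead of indexing with `[0]` and `[1]`. This is a variation on the example above, not a replacement for the recipe:

```python
column_dimensions = [[300, 600], [200, 600], [400, 400]]  # l x w

etabs_labels = []  # The accumulator
for length, width in column_dimensions:  # Unpack each two-item list directly
    etabs_labels.append(f"COL{length}X{width}")

print(etabs_labels)  # ['COL300X600', 'COL200X600', 'COL400X400']
```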


## Accessing data from two lists

Sometimes, you have two lists of data that are the same length and you wish to combine them in some way.

This is the general recipe for that:

```python
data_list_1 = [...]
data_list_2 = [...]

acc = [] # Accumulator
for idx, item in enumerate(data_list_1):
    other_item = data_list_2[idx]
    new_item = <do something with item and other_item>
    acc.append(new_item)
```

And here is an example of that:

```python
column_widths = [300, 300, 300, 400, 400, 400]
column_lengths = [400, 600, 900, 400, 600, 800]

column_sizes = []
for idx, width in enumerate(column_widths):
    length = column_lengths[idx]
    column_size = f"COL{width}X{length}"
    column_sizes.append(column_size)
```
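A different technique for the same job is the built-in `zip`, which pairs up the items of two lists so that no index is needed. The sketch below repeats the example with `zip` swapped in for `enumerate`. Note that `zip` stops at the end of the shorter list, so it assumes both lists really are the same length.

```python
column_widths = [300, 300, 300, 400, 400, 400]
column_lengths = [400, 600, 900, 400, 600, 800]

column_sizes = []
for width, length in zip(column_widths, column_lengths):
    column_sizes.append(f"COL{width}X{length}")

print(column_sizes[:3])  # ['COL300X400', 'COL300X600', 'COL300X900']
```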



## Double loops

Sometimes, you have a _nested_ collection, collections inside of collections, and you need to access items within the inner collection.

Here is the general recipe for that:

```python
nested_data = [[...], [...], ...]

outside_acc = [] # "Outside" single accumulator
for outside_item in nested_data:
    inner_acc = [] # You may or may not need an inner accumulator
    for inside_item in outside_item:
        new_inside_item = <do something with inside item>
        inner_acc.append(new_inside_item)
    new_outside_item = inner_acc
    outside_acc.append(new_outside_item)
```

And here is an example of that:

```python
# A list of beam spans 
# Each sub list represents one beam with its spans
# Dimensions are in strings in ft'in but we want as numbers in decimal feet
beams_with_spans = [["12'6", "4'8"], ["3'2", "6'3", "4'8"], ["20'2"]]

beam_spans_ft = [] # Outer accumulator
for beam_spans in beams_with_spans:
    beam_ft = [] # Inner accumulator
    for span in beam_spans:
        feet_as_str = span.split("'")[0]
        inches_as_str = span.split("'")[1]
        
        feet_as_num = float(feet_as_str)
        inches_as_num = float(inches_as_str)
        
        decimal_feet = feet_as_num + inches_as_num / 12
        beam_ft.append(decimal_feet)
    beam_spans_ft.append(beam_ft)
```
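To see what the double loop produces, here is a slightly condensed sketch of the same example. It splits each span only once and unpacks both parts in one step. The `round()` call is an addition for readable output and is not part of the original example.

```python
beams_with_spans = [["12'6", "4'8"], ["3'2", "6'3", "4'8"], ["20'2"]]

beam_spans_ft = []  # Outer accumulator
for beam_spans in beams_with_spans:
    beam_ft = []  # Inner accumulator
    for span in beam_spans:
        feet_str, inches_str = span.split("'")  # Split once, unpack both parts
        decimal_feet = float(feet_str) + float(inches_str) / 12
        beam_ft.append(round(decimal_feet, 2))  # Rounded for display only
    beam_spans_ft.append(beam_ft)

print(beam_spans_ft)
# [[12.5, 4.67], [3.17, 6.25, 4.67], [20.17]]
```

The result keeps the same nested shape as the input: one inner list per beam.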

# Looping Recipe: Looping over lines in a file

A very common application for looping is working through the lines in a file, both when reading a file and when writing one.

In your lesson directory, there is a file called `my_col.cti`. It is an spColumn .cti file which is a text-based version of a .col file.

Let's open the file and add the lines of the file to an accumulator.

```python
here = pathlib.Path()
cti_file = here / "beam_0.txt"
print(cti_file.exists())
```

```python
file_data = [] # Our accumulator for the data
with open(cti_file, "r") as file:
    for line in file: # A file object is iterable, one line at a time
        file_data.append(line)
```

Now that we have the data stored in a list called `file_data`, we can look at it and manipulate it directly in Python.

```python
file_data[1:3] # Show the lines at index 1 and 2 (the slice stops BEFORE 3)
```
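One thing worth noticing when you look at the data: each line read from a file still carries its trailing newline character (`\n`). The sketch below demonstrates this on a small stand-in file created in a temporary directory. The file name and its contents are made up for illustration.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sample_file = pathlib.Path(tmp) / "sample.txt"  # Hypothetical stand-in file
    sample_file.write_text("first line\nsecond line\nthird line\n")

    file_data = []  # Accumulator for the raw lines
    with open(sample_file, "r") as file:
        for line in file:
            file_data.append(line)

print(file_data[1:3])  # ['second line\n', 'third line\n']

clean_lines = []  # Accumulator for lines without the newline
for line in file_data:
    clean_lines.append(line.rstrip("\n"))

print(clean_lines)  # ['first line', 'second line', 'third line']
```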

We can also write the file data into a new file:

```python
my_new_cti_file = here / "new_beam_0.txt"
with open(my_new_cti_file, "w") as file:
    for line in file_data:
        file.write(line)
```
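Because `file.write()` writes exactly the text it is given and adds no newline of its own, the loop is also a convenient place to change each line on its way out. A minimal sketch, assuming some made-up lines and a temporary output file:

```python
import pathlib
import tempfile

file_data = ["alpha\n", "beta\n", "gamma\n"]  # Hypothetical lines, as if read from a file

with tempfile.TemporaryDirectory() as tmp:
    numbered_file = pathlib.Path(tmp) / "numbered.txt"
    with open(numbered_file, "w") as file:
        for idx, line in enumerate(file_data):
            file.write(f"{idx + 1}: {line}")  # Each line already ends in "\n"
    written_text = numbered_file.read_text()

print(written_text)
```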

# Looping Recipe: Looping over files in a directory

Using pathlib and loops, you can loop over files in a directory and do operations like bulk renaming.

## Looping over files

First, create a new absolute path to your current working directory ("cwd"):

```python
here = pathlib.Path.cwd()
```

Now, use the `.glob()` method to loop over everything in the directory:

```python
for file in here.glob("*"):
    print(file)
```

### What the heck is glob??

In short, `glob` (short for `global`) returns all paths in the directory, both files and sub-directories, that match the pattern given to it. `"*"` is a "wildcard" meaning "match all names". Globbing was invented in 1969 in the first Unix operating system at Bell Labs (many pieces of software still use the same names that were invented at Bell Labs).

If you only want to see file names with a certain file extension, say `.txt`:

```python
for file in here.glob("*.txt"): # Will return all files ending in .txt
    print(file)
```
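A pattern of `"*"` matches sub-directories as well as files. If you only want files, one option is to check each path with `.is_file()`. The sketch below sets up a small temporary directory with made-up names to show the difference.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    here = pathlib.Path(tmp)
    (here / "notes.txt").touch()
    (here / "data.csv").touch()
    (here / "sub_directory").mkdir()

    all_names = []   # Everything that matches "*"
    file_names = []  # Only the matches that are files
    for path in here.glob("*"):
        all_names.append(path.name)
        if path.is_file():
            file_names.append(path.name)

print(sorted(all_names))   # ['data.csv', 'notes.txt', 'sub_directory']
print(sorted(file_names))  # ['data.csv', 'notes.txt']
```

`sorted()` is used for printing because `glob` does not promise any particular order.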

### Examples of bulk file operations

Create a bunch of empty text files (to start working on):

```python
new_dir = here / "New Directory"
new_dir.mkdir(exist_ok=True) # exist_ok avoids an error if you re-run this cell
print(new_dir.exists())

file_names = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]
for file_name in file_names:
    file_path = new_dir / file_name
    file_path.touch()
```

Now, change the file names to look like RJC project file names:

```python
for file_path in new_dir.glob("*.txt"):
    name_template = "VAN.123456.0001-NOTE-20220115-CMF"
    new_file_name = f"{name_template}-{file_path.name}"
    print("Renaming file: ", file_path.name, new_file_name) # Test first
    # file_path.rename(file_path.parent / new_file_name)
```
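Once the printed names look right, the rename can be switched on. The sketch below runs the whole thing in a temporary directory so it is safe to try. Two choices here are assumptions rather than part of the lesson: the matches are collected with `sorted()` before anything is renamed, and files that already start with the template are skipped so that running the cell twice does not add the prefix twice.

```python
import pathlib
import tempfile

name_template = "VAN.123456.0001-NOTE-20220115-CMF"

with tempfile.TemporaryDirectory() as tmp:
    new_dir = pathlib.Path(tmp)
    for file_name in ["a.txt", "b.txt", "c.txt"]:
        (new_dir / file_name).touch()

    # sorted() collects every match BEFORE any file is renamed
    for file_path in sorted(new_dir.glob("*.txt")):
        if file_path.name.startswith(name_template):
            continue  # Already renamed on an earlier run, so skip it
        new_file_name = f"{name_template}-{file_path.name}"
        file_path.rename(file_path.parent / new_file_name)

    renamed = []  # Collect the resulting names to check the outcome
    for file_path in sorted(new_dir.glob("*.txt")):
        renamed.append(file_path.name)

print(renamed)
```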

### FOR GOODNESS'S SAKE!!! BE CAREFUL WHEN DOING BULK FILE OPERATIONS!!!

There is NO UNDO for this.

To be safe:

1. Make backups and keep them safe before attempting anything
2. Use the `print()` function to test what you are about to do before you do it
3. Only perform these operations on backed up files on your local machine. **DO NOT OPERATE ON FILES ON THE SHARED DRIVES OR ON ANYONE ELSE'S FILES! YOU _WILL_ BE SORRY OTHERWISE!**
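For step 1, one way to make a backup from Python is `shutil.copytree` from the standard library, which copies a whole directory. This is only a sketch: the directory names are hypothetical and everything happens inside a temporary directory.

```python
import pathlib
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    work_dir = pathlib.Path(tmp) / "New Directory"
    work_dir.mkdir()
    (work_dir / "a.txt").write_text("important notes")

    backup_dir = pathlib.Path(tmp) / "New Directory - backup"  # Hypothetical name
    shutil.copytree(work_dir, backup_dir)  # Raises an error if backup_dir already exists

    backup_ok = (backup_dir / "a.txt").read_text() == "important notes"

print(backup_ok)  # True
```

By default, `copytree` refuses to write into an existing directory, which helps prevent overwriting an earlier backup by accident.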