# Backslash on Windows and Forward Slash on OS X and Linux
On Windows, paths are written using backslashes ( \ ) as the separator between folder names. OS X and Linux, however, use the forward slash ( / ) as their path separator. If you want your programs to work on all operating systems, you will have to write your Python scripts to handle both cases.

Fortunately, this is simple to do with the `os.path.join()` function. If you pass it the string values of individual file and folder names in your path, `os.path.join()` will return a string with a file path using the correct path separators. For example:

In [1]:
import os
os.path.join('user','bin','spam')

'user/bin/spam'

The `os.path.join()` function is helpful if you need to create strings for filenames.

In [2]:
myFiles = ['accounts.txt','details.csv','invite.docx']

In [3]:
for filename in myFiles:
  print(os.path.join('C:/user/bin/',filename))

C:/user/bin/accounts.txt
C:/user/bin/details.csv
C:/user/bin/invite.docx
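
To see what `os.path.join()` produces on each platform without switching machines, the sketch below calls the two standard-library modules that back `os.path`: `ntpath` (Windows) and `posixpath` (OS X and Linux). The variable names are illustrative only.

```python
import ntpath      # the Windows flavor of os.path
import posixpath   # the OS X / Linux flavor of os.path

# Same arguments, different separators depending on the flavor
windows_path = ntpath.join('user', 'bin', 'spam')
posix_path = posixpath.join('user', 'bin', 'spam')

print(windows_path)  # user\bin\spam
print(posix_path)    # user/bin/spam
```

In everyday code you would keep calling `os.path.join()`, which picks the right flavor for the machine the script runs on.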


# The Current Working Directory
You can get the current working directory as a string value with the `os.getcwd()` function and change it with `os.chdir()`.

In [4]:
import os
os.getcwd()

'/Users/carlos/intro-data-engineering/02-python-fundamentals/task-automation/read-write-files'

In [5]:
!wget 'https://github.com/carloslme/intro-data-engineering/raw/main/02-python-fundamentals/task-automation/organize-files/datasets.zip' -P './'

--2023-08-02 18:40:57--  https://github.com/carloslme/intro-data-engineering/raw/main/02-python-fundamentals/task-automation/organize-files/datasets.zip
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/carloslme/intro-data-engineering/main/02-python-fundamentals/task-automation/organize-files/datasets.zip [following]
--2023-08-02 18:40:58--  https://raw.githubusercontent.com/carloslme/intro-data-engineering/main/02-python-fundamentals/task-automation/organize-files/datasets.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8003::154, 2606:50c0:8000::154, 2606:50c0:8002::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8003::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2644024 (2.5M) [application/zip]
Saving to: ‘./datasets.

> **IMPORTANT!**  
> Unzip the `datasets.zip` file manually before continuing.

In [8]:
os.chdir('./datasets')

In [9]:
os.getcwd()

'/Users/carlos/intro-data-engineering/02-python-fundamentals/task-automation/read-write-files/datasets'

If the folder does not exist, `os.chdir()` raises a `FileNotFoundError`:

In [10]:
os.chdir('/ThisFolderDoesNotExist')

FileNotFoundError: [Errno 2] No such file or directory: '/ThisFolderDoesNotExist'
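
A script that changes the working directory usually wants to return to where it started, even if something goes wrong. The following is a minimal sketch of that pattern, assuming a scratch folder from `tempfile.TemporaryDirectory()`; the variable names are made up for illustration.

```python
import os
import tempfile

original_dir = os.getcwd()
missing_folder_error = False

# TemporaryDirectory() creates a scratch folder that is guaranteed to exist
with tempfile.TemporaryDirectory() as scratch_dir:
    try:
        os.chdir(scratch_dir)
        # A subfolder that was never created, so chdir() must fail
        os.chdir(os.path.join(scratch_dir, 'ThisFolderDoesNotExist'))
    except FileNotFoundError:
        missing_folder_error = True
    finally:
        # Always return to where we started, even after an error
        os.chdir(original_dir)

print(missing_folder_error)          # True
print(os.getcwd() == original_dir)   # True
```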

# Absolute vs. Relative Paths
There are two ways to specify a file path. 
* An absolute path, which always begins with the root folder
* A relative path, which is relative to the program’s current working directory


There are also the dot (.) and dot-dot (..) folders.
* A single period for a folder name is shorthand for "this directory".
* Two periods means "the parent folder".
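
As a small illustration of the dot and dot-dot folders, the sketch below builds a relative path containing both and lets `os.path.normpath()` simplify it. The folder names (`projects`, `reports`, `data`) are hypothetical and do not need to exist.

```python
import os

# '.' stays in the same folder, '..' steps back up to the parent
messy = os.path.join('projects', '.', 'reports', '..', 'data')
clean = os.path.normpath(messy)

print(clean)                                   # projects/data on OS X and Linux
print(os.path.isabs(clean))                    # False: still a relative path
print(os.path.isabs(os.path.abspath(clean)))   # True: anchored at the root
```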

# Creating New Folders with `os.makedirs()`
`os.makedirs()` will create any necessary intermediate folders in order to ensure that the full path exists.


In [12]:
import os
os.makedirs('./dummy_directories/parent/son/grandson')

In [14]:
!ls ./dummy_directories

parent


In [15]:
!ls ./dummy_directories/parent

son


In [16]:
!ls ./dummy_directories/parent/son

grandson

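Running the `os.makedirs()` cell a second time fails because the folders already exist. The sketch below shows that behavior and the `exist_ok=True` argument that avoids it. It works inside a temporary folder so nothing is left behind; the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    nested = os.path.join(base_dir, 'parent', 'son', 'grandson')

    os.makedirs(nested)            # creates parent, son and grandson in one call

    try:
        os.makedirs(nested)        # the path already exists now
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # exist_ok=True makes the call safe to repeat
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

print(second_call_failed)  # True
print(created)             # True
```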

# The `os.path` Module
The `os.path` module contains many helpful functions related to filenames and file paths.



## Handling Absolute and Relative Paths
* Calling `os.path.abspath(path)` will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.
* Calling `os.path.isabs(path)` will return `True` if the argument is an absolute path and `False` if it is a relative path.
* Calling `os.path.relpath(path, start)` will return a string of a relative path from the `start` path to `path`. If `start` is not provided, the current working directory is used as the start path.

In [17]:
os.path.abspath('.')

'/Users/carlos/intro-data-engineering/02-python-fundamentals/task-automation/read-write-files/datasets'

In [18]:
os.path.isabs('.')

False

In [20]:
os.path.relpath('/dummy_directories/parent','/dummy_directories/')

'parent'

The call below calculates the relative path that you need to navigate from the base path to reach the target path. In this case, the target path is /dummy_directories/, and the base path is /dummy_directories/parent/son/grandson/. To go from the base path to the target path, you need to go up three directories (from grandson to son, then to parent, then to dummy_directories). The resulting relative path is '../../..'.

Note that the relative path is calculated based on the directory structure, not the actual existence of directories. The function doesn't check whether the directories exist in the file system.

In [21]:
os.path.relpath('/dummy_directories/','/dummy_directories/parent/son/grandson/')

'../../..'
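
One way to check a result from `os.path.relpath()` is to join it back onto the start path and normalize: you should land on the target. The sketch below does this with relative versions of the folders used above; none of them has to exist on disk.

```python
import os

base = os.path.join('dummy_directories', 'parent', 'son', 'grandson')
target = 'dummy_directories'

up = os.path.relpath(target, base)     # '../../..' on OS X and Linux
down = os.path.relpath(base, target)   # 'parent/son/grandson' on OS X and Linux

# Walking "up" from the base must arrive at the target
round_trip = os.path.normpath(os.path.join(base, up))
print(round_trip == target)  # True
```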

* Calling `os.path.dirname(path)` will return a string of everything that comes before the last slash in the path argument. 
* Calling `os.path.basename(path)` will return a string of everything that comes after the last slash in the path argument.

In [23]:
path = '/datasets/README.md'
os.path.basename(path)

'README.md'

In [24]:
os.path.dirname(path)

'/datasets'

`os.path.split()` is a nice shortcut if you need both values.

In [25]:
californiaFilePath = '/datasets/california_housing_test.csv'
os.path.split(californiaFilePath)

('/datasets', 'california_housing_test.csv')

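`os.path.sep` is a string (not a function) that holds the path separator for the current operating system. Passing it to a path string's `split()` method returns a list with one string for each folder, followed by the filename.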

In [26]:
californiaFilePath.split(os.path.sep)

['', 'datasets', 'california_housing_test.csv']
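
The functions above fit together: `os.path.split()` returns the same two values as `os.path.dirname()` and `os.path.basename()`, and joining them rebuilds the original path. A short sketch, using a relative path built with `os.path.join()` so that it behaves the same on every operating system:

```python
import os

file_path = os.path.join('datasets', 'california_housing_test.csv')

folder, name = os.path.split(file_path)
same_as_helpers = (folder, name) == (os.path.dirname(file_path),
                                     os.path.basename(file_path))

rebuilt = os.path.join(folder, name)      # back to the original path
parts = file_path.split(os.path.sep)      # every component as a list

print(same_as_helpers)        # True
print(rebuilt == file_path)   # True
print(parts)                  # ['datasets', 'california_housing_test.csv']
```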

# Finding File Sizes and Folders Contents
The `os.path` module provides functions for finding the size of a file in bytes and the files and folders inside a given folder.
* Calling `os.path.getsize(path)` will return the size in bytes of the file in the path argument.
* Calling `os.listdir(path)` will return a list of filename strings for each file in the path argument. (Note that this function is in the `os` module, not `os.path`.)

In [29]:
!pwd

/Users/carlos/intro-data-engineering/02-python-fundamentals/task-automation/read-write-files/datasets


In [30]:
import os

os.path.getsize('california_housing_test.csv')

301141

In [31]:
os.listdir('./')

['california_housing_train.csv',
 'anscombe.json',
 'dummy_directories',
 'california_housing_test.csv',
 'README.md',
 'mnist_test.csv']

In [33]:
# Getting total size of all the files in the directory
totalSize = 0
for filename in os.listdir('./'):
  totalSize = totalSize + os.path.getsize(os.path.join('./', filename))
print(totalSize)

20299737
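
The loop above also calls `os.path.getsize()` on subfolders such as `dummy_directories`, which adds the size of the folder entry itself to the total. If you only want regular files, one option is to filter with `os.path.isfile()`. The sketch below wraps that in a hypothetical helper, `total_file_size()`, and tries it on a temporary folder with files of known size.

```python
import os
import tempfile

def total_file_size(folder):
    """Sum the sizes, in bytes, of the regular files directly inside folder."""
    total = 0
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path):      # skip subfolders
            total += os.path.getsize(full_path)
    return total

with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, 'a.bin'), 'wb') as f:
        f.write(b'x' * 10)                 # 10 bytes
    with open(os.path.join(demo_dir, 'b.bin'), 'wb') as f:
        f.write(b'y' * 5)                  # 5 bytes
    os.makedirs(os.path.join(demo_dir, 'subfolder'))   # ignored by the helper
    size = total_file_size(demo_dir)

print(size)  # 15
```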


# Checking Path Validity
The `os.path` module provides functions to check whether a given path exists and whether it is a file or folder.
* Calling `os.path.exists(path)` will return True if the file or folder referred to in the argument exists and will return False if it does not exist.
* Calling `os.path.isfile(path)` will return True if the path argument exists and is a file and will return False otherwise. 
* Calling `os.path.isdir(path)` will return True if the path argument exists and is a folder and will return False otherwise.


In [35]:
import os

os.path.exists('./dummy_directories')

True

In [36]:
os.path.exists('/test')

False

In [37]:
os.path.isdir('/dummy_directories/')

False

In [38]:
os.path.isfile('/dummy_directories')

False

In [39]:
os.path.isdir('anscombe.json')

False
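
The three checks can be combined into a single helper. The sketch below defines a hypothetical `describe_path()` function (not part of `os.path`) and tries it on a temporary folder, a file inside it, and a path that was never created.

```python
import os
import tempfile

def describe_path(path):
    """Return 'missing', 'file', 'folder' or 'other' for the given path."""
    if not os.path.exists(path):
        return 'missing'
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'folder'
    return 'other'   # exists, but is neither a regular file nor a folder

with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, 'notes.txt')
    with open(file_path, 'w') as f:
        f.write('hello')

    results = [
        describe_path(demo_dir),
        describe_path(file_path),
        describe_path(os.path.join(demo_dir, 'nope')),
    ]

print(results)  # ['folder', 'file', 'missing']
```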

# The File Reading/Writing Process
There are three steps to reading or writing files in Python. 

1.   Call the `open()` function to return a File object. 
2.   Call the `read()` or `write()` method on the File object. 
3.   Close the file by calling the `close()` method on the File object.
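
The three steps can be sketched as follows, once for writing and once for reading. The file name `steps.txt` is made up for the example, and the file lives in a temporary folder that is removed afterwards.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'steps.txt')

    out_file = open(path, 'w')         # Step 1: open() returns a File object
    out_file.write('three steps')      # Step 2: write to it
    out_file.close()                   # Step 3: close it

    in_file = open(path)               # Step 1 again, this time in read mode
    text = in_file.read()              # Step 2: read the contents
    in_file.close()                    # Step 3: close it

print(text)             # three steps
print(in_file.closed)   # True
```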

## Opening Files with the `open()` Function
The `open()` function returns a File object.

---



In [40]:
nameFile = './hello.txt'
# Opening in write mode creates hello.txt (or empties it if it already exists)
helloFile = open(nameFile,'w')
# Close the file before it is opened again in the next cell
helloFile.close()

In [41]:
with open(nameFile,'a') as f:
  f.write('Hello World!')
  # No f.close() needed: the with statement closes the file automatically

# Reading the Contents of Files
If you want to read the entire contents of a file as a string value, use the File object’s `read()` method.

In [42]:
helloFile = open(nameFile,'r')
helloContent = helloFile.read()
helloFile.close()
helloContent

'Hello World!'

Alternatively, you can use the `readlines()` method to get a list of string values from the file, one string for each line of text.

In [43]:
with open('./sonnet29.txt','w') as s:
  s.write('When, in disgrace with fortune and men\'s eyes, \n I all alone beweep my outcast state, \n And trouble deaf heaven with my bootless cries, \n And look upon myself and curse my fate,')

In [44]:
sonnetFile = open('./sonnet29.txt')
sonnetFile.readlines()

["When, in disgrace with fortune and men's eyes, \n",
 ' I all alone beweep my outcast state, \n',
 ' And trouble deaf heaven with my bootless cries, \n',
 ' And look upon myself and curse my fate,']
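
As the output shows, `readlines()` keeps the trailing `\n` on every line except a final line that has none. If you do not want the newlines, one common approach is to loop over the File object and strip them, as in this sketch (the file name and contents are placeholders):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'lines.txt')
    with open(path, 'w') as f:
        f.write('first\nsecond\nthird')

    with open(path) as f:
        raw_lines = f.readlines()                        # newlines kept

    with open(path) as f:
        clean_lines = [line.rstrip('\n') for line in f]  # newlines removed

print(raw_lines)    # ['first\n', 'second\n', 'third']
print(clean_lines)  # ['first', 'second', 'third']
```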

# Writing to Files
Python allows you to write content to a file in a way similar to how the `print()` function “writes” strings to the screen. You can’t write to a file you’ve opened in read mode, though. Instead, you need to open it in “write plaintext” mode or “append plaintext” mode, or write mode and append mode for short.

* Pass `'a'` as the second argument to `open()` the file in append mode. Append mode will append text to the end of the existing file.
* Pass `'w'` as the second argument to `open()` to open the file in write mode. Write mode will overwrite the existing file and start from scratch.

If the filename passed to `open()` does not exist, both write and append mode will create a new, blank file.

Call the `close()` method before opening the file again.

In [46]:
# Example 1
baconFile = open('bacon.txt','w')
baconFile.write('Hello world!\n')
baconFile.close()

In [47]:
# Example 2: the same write as Example 1, using a with statement
with open('bacon.txt','w') as baconFile:
    baconFile.write('Hello world!\n')

In [48]:
baconFile = open('bacon.txt','a')
baconFile.write('Bacon is not a vegetable.')
baconFile.close()

In [49]:
baconFile = open('bacon.txt')
content = baconFile.read()
baconFile.close()
print(content)

Hello world!
Bacon is not a vegetable.
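
To make the difference between the two modes explicit, the sketch below writes to the same file three times and reads it back after each change. The file name `modes.txt` is made up, and the file is created in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'modes.txt')

    with open(path, 'w') as f:     # file does not exist yet, so it is created
        f.write('one\n')
    with open(path, 'a') as f:     # append mode adds to the end
        f.write('two\n')
    with open(path) as f:
        after_append = f.read()

    with open(path, 'w') as f:     # write mode starts from scratch
        f.write('three\n')
    with open(path) as f:
        after_rewrite = f.read()

print(repr(after_append))   # 'one\ntwo\n'
print(repr(after_rewrite))  # 'three\n'
```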
