# Using the OS Module

<span>This notebook is a collection of small Python snippets, built mostly on Python's `os` module, that I have found useful for a variety of tasks. These tasks include removing files from directories, moving files between directories, parsing content from notebooks, appending content to files, and changing file types. Hopefully you find some useful code below.</span>

### Import Preliminaries

In [1]:
import os

<br><br>
### Get Current Directory

In [2]:
cwd = os.getcwd()
cwd

'/Users/kavi/Documents/DataScience/Pipelines'

<br><br>
### Find Path Function

Search the given directory and its subdirectories for the first instance of a specific file. Return the path of this file.

In [3]:
def find_path(name, path):
    '''
    Search the given directory and its subdirectories for the first
    instance of a specific file. Return the path of this file.
    '''
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)

In [4]:
# Run our Find Path Function
find_path('10-15-17 Rescaling Features.ipynb',
          '/Users/Kavi/Documents/DataScience')

'/Users/Kavi/Documents/DataScience/Guides/10-15-17 Rescaling Features.ipynb'
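One detail worth knowing: when nothing matches, `find_path` falls off the end of its loop and returns `None`. The sketch below shows this on a throwaway directory tree built with `tempfile`. The folder and file names (`guides`, `notes.txt`, `missing.txt`) are made up for illustration, and the explicit `return None` is an addition to make the behavior visible.

```python
import os
import tempfile


def find_path(name, path):
    """Return the path of the first file called `name` under `path`, or None."""
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)
    return None  # made explicit here; the original returns None implicitly


# Build a small throwaway tree: <tmp>/guides/notes.txt (hypothetical names)
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'guides'))
    target = os.path.join(tmp, 'guides', 'notes.txt')
    with open(target, 'w') as f:
        f.write('hello')

    found = find_path('notes.txt', tmp)      # file exists in a subdirectory
    missing = find_path('missing.txt', tmp)  # file does not exist anywhere

print(found == target)  # prints True
print(missing)          # prints None
```

Checking the result with `if found is not None:` before using it avoids passing `None` to functions that expect a path.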

<br><br>
### Breaking Down the Find Path Function

The `os.walk(path)` function is pretty cool. Let's take a moment to break down each variable it yields. `root` is the directory currently being visited, `dirs` is the list of subdirectories inside `root` (which the walk will visit next), and `files` is the list of files that exist in `root`.

In [5]:
# Breaking Down this Function
path = '/Users/Kavi/Documents/DataScience'
name = 'README.md'
result = []
for root, dirs, files in os.walk(path):
    print('\n\n'+'-'*15)
    print('Root:',root)
    print('\n\n'+'-'*15)
    print('Dirs:',dirs)
    print('\n\n'+'-'*15)
    print('Files:',files)
    print('\n\n'+'-'*15)
    print(os.walk(path))
    break



---------------
Root: /Users/Kavi/Documents/DataScience


---------------
Dirs: ['Predictive Analysis', 'Kaggle', 'FlashCards', 'Whitepapers', 'Tesitng', 'Competitions', 'Descriptive Analysis', 'Pipelines', 'Books', 'Stack Overflow', 'Visualizations', 'Notes', 'Reading list', 'Guides', 'Economics', 'Tutorials', '.ipynb_checkpoints', '.git', 'BrainStation', 'Community', 'NoSQL', 'Interviews', 'Techniques', 'SQL']


---------------
Files: ['.DS_Store', 'Portfolio.md', 'README.md', '.gitignore']


---------------
<generator object walk at 0x106613570>
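The cell above stops after the first directory because of the `break`. As a rough sketch of what the full walk looks like, the example below builds a tiny temporary tree (the names `a.txt`, `sub`, and `b.txt` are assumptions for illustration) and collects one `(root, dirs, files)` tuple per directory visited.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <tmp>/a.txt and <tmp>/sub/b.txt
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(tmp, rel), 'w') as f:
            f.write('x')

    # One (root, dirs, files) tuple per directory visited;
    # root is shown relative to tmp to keep the output short
    visited = [(os.path.relpath(root, tmp), dirs, files)
               for root, dirs, files in os.walk(tmp)]

for root, dirs, files in visited:
    print(root, dirs, files)
# prints:
# . ['sub'] ['a.txt']
# sub [] ['b.txt']
```

Each directory appears once as `root`, and its `dirs` entries become the `root` of later iterations.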


<br><br>
### Find All Paths Function

Search the given directory and its subdirectories for every instance of a specific file. Return all of the matching file paths as a list.

In [6]:
def find_all_paths(name, path):
    '''
    Search the given directory and its subdirectories for every
    instance of a specific file. Return all of the matching file
    paths as a list.
    '''
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            # os.path.join(root, name) is a string
            result.append(os.path.join(root, name)) 
            
    return result

In [7]:
# Run our Find All Paths Function
find_all_paths('README.md','/Users/Kavi/Documents/DataScience')

['/Users/Kavi/Documents/DataScience/README.md',
 '/Users/Kavi/Documents/DataScience/Predictive Analysis/README.md',
 '/Users/Kavi/Documents/DataScience/Competitions/README.md',
 '/Users/Kavi/Documents/DataScience/Competitions/DonorsChoose Application/README.md',
 '/Users/Kavi/Documents/DataScience/Descriptive Analysis/README.md',
 '/Users/Kavi/Documents/DataScience/Pipelines/AWS Pipelines/README.md',
 '/Users/Kavi/Documents/DataScience/Notes/Readings/README.md',
 '/Users/Kavi/Documents/DataScience/Reading list/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/Tutorial - Luigi/data-engineering-101-master/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/Tutorial - Luigi/data-engineering-101-master/topmodel/README.md',
 '/Users/Kavi/Documents/DataScience/Community/README.md',
 '/Users/Kavi/Documents/DataScience/Techniques/README.md',
 '/Users/Kavi/Documents/DataScience/SQL/README.md']

<br><br>
### Find All Paths for a List of Files
Search the given directory and its subdirectories for every instance of every file in a list. Return all of the matching file paths as a single list.

In [8]:
def find_all_paths_in_list(list_of_files, path):
    '''
    Search the given directory and its subdirectories for every
    instance of every file in a list. Return all of the matching
    file paths as a single list.
    '''
    result = []
    for name in list_of_files:
        for root, dirs, files in os.walk(path):
            if name in files:
                # os.path.join(root, name) is a string
                result.append(os.path.join(root, name))
                
    return result

In [9]:
# Generating a list of files
list_of_files = ['README.md',
                 '02-01-17 4 Time Saving Tricks in Pandas.ipynb']

# Run our Find All Paths in List Function
find_all_paths_in_list(list_of_files,'/Users/Kavi/Documents/DataScience')

['/Users/Kavi/Documents/DataScience/README.md',
 '/Users/Kavi/Documents/DataScience/Predictive Analysis/README.md',
 '/Users/Kavi/Documents/DataScience/Competitions/README.md',
 '/Users/Kavi/Documents/DataScience/Competitions/DonorsChoose Application/README.md',
 '/Users/Kavi/Documents/DataScience/Descriptive Analysis/README.md',
 '/Users/Kavi/Documents/DataScience/Pipelines/AWS Pipelines/README.md',
 '/Users/Kavi/Documents/DataScience/Notes/Readings/README.md',
 '/Users/Kavi/Documents/DataScience/Reading list/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/Tutorial - Luigi/data-engineering-101-master/README.md',
 '/Users/Kavi/Documents/DataScience/Tutorials/Tutorial - Luigi/data-engineering-101-master/topmodel/README.md',
 '/Users/Kavi/Documents/DataScience/Community/README.md',
 '/Users/Kavi/Documents/DataScience/Techniques/README.md',
 '/Users/Kavi/Documents/DataScience/SQL/README.md',
 '/Users/Kavi/Documents/DataSc
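`find_all_paths_in_list` walks the whole directory tree once for every name in the list. A possible alternative, sketched below, walks the tree a single time and checks each file against a set of wanted names. The function name `find_all_paths_single_walk` and the test files are hypothetical, and results come back grouped by directory rather than by file name.

```python
import os
import tempfile


def find_all_paths_single_walk(list_of_files, path):
    """Hypothetical variant: walk `path` once and collect every matching file."""
    wanted = set(list_of_files)  # set membership checks are fast
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if name in wanted:
                result.append(os.path.join(root, name))
    return result


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: two README.md files and one other.txt
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('README.md',
                os.path.join('sub', 'README.md'),
                os.path.join('sub', 'other.txt')):
        with open(os.path.join(tmp, rel), 'w') as f:
            f.write('x')

    matches = find_all_paths_single_walk(['README.md', 'other.txt'], tmp)
    # Sort relative paths so the printed order does not depend on the OS
    relative = sorted(os.path.relpath(p, tmp) for p in matches)

print(relative)
```

On a large directory tree this should save time, since the file system is traversed once regardless of how many names are in the list.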

<br><br>
### Importing Data From Text File

In [10]:
# Use a with block so the file is closed once it has been read
with open('notebooks.txt','r') as file:
    contents = file.read()
contents

'17-08-01 Cross Validation and K-Folds.ipynb\n18-07-03 Writing a File to AWS S3.ipynb\n18-07-03 Reading a File from AWS S3.ipynb\n18-07-03 Connecting to a Local Database.ipynb\n18-07-03 Connecting to a Local Database with SHH.ipynb\n17-11-07 Dimensional Pivot Table.ipynb\n18-06-06 Every Matplotlib Plot Linestyle.ipynb\n18-06-06 Every Matplotlib Plot Marker.ipynb\n17-08-05 Histograms.ipynb\n15-02-02 Barplots.ipynb\n17-10-16 Wordclouds in Python.ipynb\n17-12-04 Heatmaps.ipynb\n18-07-07 Styling DataFrames.ipynb\n18-07-07 Resampling Datetime.ipynb\n18-07-24 Converting Notebook to Slides.ipynb\n18-07-24 Label Encoding.ipynb\n17-10-15 Plotting Residuals.ipynb\n17-10-15 Correlation Matricesipynb\n18-07-04 Select Dtypes.ipynb\n18-03-28 Creating Dummy Variables.ipynb\n17-10-09 Standardizations.ipynb\n17-08-01 Random Grid Search.ipynb\n17-08-01 Full Grid Search.ipynb\n18-07-28 Notebook Snippets.ipynb\n18-07-29 Iris Analysis.ipynb\n18-07-29 Classification Models.ipynb\n18-07-24 Binning Features.i

<br><br>
### Import Data From Text File into a List

In [11]:
# First Method (each line keeps its trailing '\n')
with open('Data/notebooks.txt','r', encoding="utf-8") as file:
    lines = file.readlines()
lines

['17-08-01 Cross Validation and K-Folds.ipynb\n',
 '18-07-03 Writing a File to AWS S3.ipynb\n',
 '18-07-03 Reading a File from AWS S3.ipynb\n',
 '18-07-03 Connecting to a Local Database.ipynb\n',
 '18-07-03 Connecting to a Local Database with SHH.ipynb\n',
 '17-11-07 Dimensional Pivot Table.ipynb\n',
 '18-06-06 Every Matplotlib Plot Linestyle.ipynb\n',
 '18-06-06 Every Matplotlib Plot Marker.ipynb\n',
 '17-08-05 Histograms.ipynb\n',
 '15-02-02 Barplots.ipynb\n',
 '17-10-16 Wordclouds in Python.ipynb\n',
 '17-12-04 Heatmaps.ipynb\n',
 '18-07-07 Styling DataFrames.ipynb\n',
 '18-07-07 Resampling Datetime.ipynb\n',
 '18-07-24 Converting Notebook to Slides.ipynb\n',
 '18-07-24 Label Encoding.ipynb\n',
 '17-10-15 Plotting Residuals.ipynb\n',
 '17-10-15 Correlation Matricesipynb\n',
 '18-07-04 Select Dtypes.ipynb\n',
 '18-03-28 Creating Dummy Variables.ipynb\n',
 '17-10-09 Standardizations.ipynb\n',
 '17-08-01 Random Grid Search.ipynb\n',
 '17-08-01 Full Grid Search.ipynb\n',
 '18-07-28 Note

<br><br>
### Import Data From Text File into a List

In [12]:
# Second Method
with open('Data/notebooks.txt','r', encoding="utf-8") as f:
    mylist = f.read().splitlines()
mylist

['17-08-01 Cross Validation and K-Folds.ipynb',
 '18-07-03 Writing a File to AWS S3.ipynb',
 '18-07-03 Reading a File from AWS S3.ipynb',
 '18-07-03 Connecting to a Local Database.ipynb',
 '18-07-03 Connecting to a Local Database with SHH.ipynb',
 '17-11-07 Dimensional Pivot Table.ipynb',
 '18-06-06 Every Matplotlib Plot Linestyle.ipynb',
 '18-06-06 Every Matplotlib Plot Marker.ipynb',
 '17-08-05 Histograms.ipynb',
 '15-02-02 Barplots.ipynb',
 '17-10-16 Wordclouds in Python.ipynb',
 '17-12-04 Heatmaps.ipynb',
 '18-07-07 Styling DataFrames.ipynb',
 '18-07-07 Resampling Datetime.ipynb',
 '18-07-24 Converting Notebook to Slides.ipynb',
 '18-07-24 Label Encoding.ipynb',
 '17-10-15 Plotting Residuals.ipynb',
 '17-10-15 Correlation Matricesipynb',
 '18-07-04 Select Dtypes.ipynb',
 '18-03-28 Creating Dummy Variables.ipynb',
 '17-10-09 Standardizations.ipynb',
 '17-08-01 Random Grid Search.ipynb',
 '17-08-01 Full Grid Search.ipynb',
 '18-07-28 Notebook Snippets.ipynb',
 '18-07-29 Iris Analysis

<br><br>
### Copying Files to Another Directory

In [13]:
from shutil import copyfile

file_paths = ['/Users/Kavi/Documents/DataScience/README.md']

for filepath in file_paths:
    copyfile(filepath, '/Users/Kavi/Documents/DataScience/Pipelines/Data/copy_README.md')
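The loop above writes every source file to the same destination path, so with more than one entry in `file_paths` each copy would overwrite the previous one. A minimal sketch of one way around this is to build the destination from each file's own name with `os.path.basename`. The `src` and `dst` folders and the two file names below are assumptions made up for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source and destination folders
    src_dir = os.path.join(tmp, 'src')
    dst_dir = os.path.join(tmp, 'dst')
    os.makedirs(src_dir)
    os.makedirs(dst_dir)

    # Create two small source files to copy
    file_paths = []
    for name in ('README.md', 'Portfolio.md'):
        source = os.path.join(src_dir, name)
        with open(source, 'w') as f:
            f.write(name)
        file_paths.append(source)

    for filepath in file_paths:
        # Keep the original file name and prefix it with 'copy_'
        destination = os.path.join(dst_dir, 'copy_' + os.path.basename(filepath))
        shutil.copyfile(filepath, destination)

    copied = sorted(os.listdir(dst_dir))

print(copied)  # prints ['copy_Portfolio.md', 'copy_README.md']
```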

<br><br>
### Writing a File to Directory

In [14]:
with open('Data/sample_file.txt','w') as f2:
    f2.write('sample text')
    # No explicit close() is needed: the with block closes the file on exit

<br><br>
### Appending Data to a File and Saving it Again

In [15]:
with open('Data/sample_file.txt','r') as f1:
    with open('Data/sample_file2.txt','w') as f2:
        # Copy the text from the first file
        f2.write(f1.read())
        # Add a new line of text to the copy
        f2.write("\n You are the second version of the same file")
# Both files are closed automatically when their with blocks exit
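The cell above builds a second file from the contents of the first. When the goal is to add text to an existing file in place, opening it in append mode (`'a'`) is another option. The sketch below assumes a temporary file named `sample_file.txt`, and the appended sentence is made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample_file.txt')  # hypothetical file

    # Create the file with some initial text
    with open(path, 'w') as f:
        f.write('sample text')

    # Mode 'a' keeps the existing content and writes at the end of the file
    with open(path, 'a') as f:
        f.write('\nYou are an appended line')

    with open(path, 'r') as f:
        contents = f.read()

print(contents)
# prints:
# sample text
# You are an appended line
```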

<br><br>
### Deleting Data in a Directory

In [16]:
# Remove the text files from the directory
os.remove("Data/sample_file2.txt")
os.remove("Data/sample_file.txt")
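`os.remove` raises `FileNotFoundError` when the file is not there, which happens if this cell is run twice. A small sketch of a more forgiving version follows; `remove_if_exists` is a hypothetical helper, not part of the `os` module.

```python
import os
import tempfile


def remove_if_exists(path):
    """Hypothetical helper: delete `path` if present; return True if removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'sample_file.txt')
    with open(target, 'w') as f:
        f.write('x')

    first = remove_if_exists(target)   # file exists, so it is deleted
    second = remove_if_exists(target)  # file is already gone

print(first, second)  # prints True False
```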

Author: Kavi Sekhon