## Managing Folders and Files using Python

Let us quickly recap folders and files, especially on Linux. You need to be comfortable with the following.
* Differentiating files and folders. Keep in mind that folders and directories mean the same thing.
* Understanding absolute (fully qualified) paths.
* Understanding relative paths.
* Understanding file and folder permissions.
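
The distinctions in the list above can also be checked from Python. The following is a minimal sketch, assuming a throwaway directory created with `tempfile`; the names `sample_dir` and `sample.txt` are hypothetical and not part of the course material.

```python
import os
import stat
import tempfile

# Create a throwaway folder containing one folder and one file (hypothetical names).
base_dir = tempfile.mkdtemp()
folder_path = os.path.join(base_dir, 'sample_dir')
file_path = os.path.join(folder_path, 'sample.txt')
os.mkdir(folder_path)
with open(file_path, 'w') as f:
    f.write('hello')

# Files vs. folders (directories)
print(os.path.isdir(folder_path), os.path.isfile(file_path))

# Absolute paths start from the root (/); relative paths are resolved
# against the present working directory.
relative_path = os.path.join('sample_dir', 'sample.txt')
print(os.path.isabs(file_path), os.path.isabs(relative_path))

# Permissions in the same form that ls -l shows
os.chmod(file_path, 0o644)
permissions = stat.filemode(os.stat(file_path).st_mode)
print(permissions)
```

On Linux, the last line prints the permission string in the same layout as the first column of `ls -l`.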

We will first see how we can leverage `subprocess` to run Linux commands to manage files, and then we will go through the above topics.

We will run the following commands using the Python `subprocess` module. That way, you will also see where `subprocess` is useful.

```shell
# Listing files and folders under /data/retail_db
ls -ltr /data/retail_db

# Listing files in the present working directory
ls -ltr

# Listing files in the home directory. ~ represents the home directory.
# You can find the ~ key to the left of 1 on most keyboards.
ls -ltr ~
```

In [1]:
import subprocess

In [4]:
# Output is not reader friendly.
# Output is of type bytes
subprocess.check_output('ls -ltr /data/retail_db', shell=True)

b'total 20156\n-rw-r--r-- 1 root root      806 Jan 21  2021 README.md\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 products\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 orders\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 order_items\n-rw-r--r-- 1 root root 10297372 Jan 21  2021 load_db_tables_pg.sql\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 departments\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 customers\n-rw-r--r-- 1 root root     1748 Jan 21  2021 create_db_tables_pg.sql\n-rw-r--r-- 1 root root 10303297 Jan 21  2021 create_db.sql\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 categories\n'

In [5]:
# We can decode to string and apply string functions
# Now the output is of type string
subprocess.check_output('ls -ltr /data/retail_db', shell=True).decode('utf-8')

'total 20156\n-rw-r--r-- 1 root root      806 Jan 21  2021 README.md\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 products\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 orders\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 order_items\n-rw-r--r-- 1 root root 10297372 Jan 21  2021 load_db_tables_pg.sql\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 departments\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 customers\n-rw-r--r-- 1 root root     1748 Jan 21  2021 create_db_tables_pg.sql\n-rw-r--r-- 1 root root 10303297 Jan 21  2021 create_db.sql\ndrwxr-xr-x 2 root root     4096 Jan 21  2021 categories\n'
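
As an alternative to calling `decode` yourself, `subprocess.run` can return the output as a string directly. This is a sketch rather than the approach used in this notebook: it passes the command as a list (so `shell=True` is not needed) and lists a temporary directory instead of `/data/retail_db` so that it runs anywhere `ls` is available. The file name `README.md` inside it is only illustrative.

```python
import os
import subprocess
import tempfile

# Temporary directory with one file, standing in for /data/retail_db.
target_dir = tempfile.mkdtemp()
with open(os.path.join(target_dir, 'README.md'), 'w') as f:
    f.write('sample')

# text=True decodes stdout to str; check=True raises CalledProcessError
# if ls exits with a non-zero status.
result = subprocess.run(
    ['ls', '-ltr', target_dir],
    capture_output=True,
    text=True,
    check=True,
)
print(type(result.stdout))
print(result.stdout)
```

Passing the arguments as a list also avoids quoting problems when a path contains spaces.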

In [6]:
# We can use splitlines to convert this big string into a list of strings.
# splitlines splits the string at line boundaries such as the new line character.
subprocess.check_output('ls -ltr /data/retail_db', shell=True).decode('utf-8').splitlines()

['total 20156',
 '-rw-r--r-- 1 root root      806 Jan 21  2021 README.md',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 products',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 orders',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 order_items',
 '-rw-r--r-- 1 root root 10297372 Jan 21  2021 load_db_tables_pg.sql',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 departments',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 customers',
 '-rw-r--r-- 1 root root     1748 Jan 21  2021 create_db_tables_pg.sql',
 '-rw-r--r-- 1 root root 10303297 Jan 21  2021 create_db.sql',
 'drwxr-xr-x 2 root root     4096 Jan 21  2021 categories']

In [7]:
# We can iterate through the list and print one item at a time.
# Now the output will be reader friendly
output = subprocess.check_output('ls -ltr /data/retail_db', shell=True).decode('utf-8').splitlines()
for line in output:
    print(line)

total 20156
-rw-r--r-- 1 root root      806 Jan 21  2021 README.md
drwxr-xr-x 2 root root     4096 Jan 21  2021 products
drwxr-xr-x 2 root root     4096 Jan 21  2021 orders
drwxr-xr-x 2 root root     4096 Jan 21  2021 order_items
-rw-r--r-- 1 root root 10297372 Jan 21  2021 load_db_tables_pg.sql
drwxr-xr-x 2 root root     4096 Jan 21  2021 departments
drwxr-xr-x 2 root root     4096 Jan 21  2021 customers
-rw-r--r-- 1 root root     1748 Jan 21  2021 create_db_tables_pg.sql
-rw-r--r-- 1 root root 10303297 Jan 21  2021 create_db.sql
drwxr-xr-x 2 root root     4096 Jan 21  2021 categories
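
Once the listing is a list of strings, ordinary string functions can pull details out of it. The sketch below reuses a few of the lines printed above and relies on the first character of each line: `d` marks a folder and `-` marks a regular file. Parsing `ls` output this way is fragile (it assumes the nine-column `ls -l` layout shown here), so treat it as an illustration rather than a robust approach.

```python
# A few lines copied from the ls -ltr /data/retail_db output above.
output = [
    'total 20156',
    '-rw-r--r-- 1 root root      806 Jan 21  2021 README.md',
    'drwxr-xr-x 2 root root     4096 Jan 21  2021 products',
    'drwxr-xr-x 2 root root     4096 Jan 21  2021 orders',
    '-rw-r--r-- 1 root root 10303297 Jan 21  2021 create_db.sql',
]

folders = []
files = []
for line in output:
    if line.startswith('total'):
        continue  # summary line, not an entry
    # Split into at most 9 fields so that the last field is the name.
    fields = line.split(maxsplit=8)
    name = fields[-1]
    if line.startswith('d'):
        folders.append(name)
    else:
        files.append(name)

print(folders)
print(files)
```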


In [8]:
subprocess.check_output('ls -ltr', shell=True).decode('utf-8').splitlines()

['total 116',
 '-rw-r--r-- 1 itv002461 students  1307 Apr 26 05:40 01_basics_of_file_io_using_python.ipynb',
 '-rw-r--r-- 1 itv002461 students  1686 Apr 26 05:41 02_overview_of_file_io.ipynb',
 '-rw-r--r-- 1 itv002461 students  9591 Apr 26 06:43 03_folders_and_files.ipynb',
 '-rw-r--r-- 1 itv002461 students  3939 Apr 26 07:04 04_file_paths_and_names.ipynb',
 '-rw-r--r-- 1 itv002461 students  5854 Apr 26 07:08 05_overview_of_retail_data.ipynb',
 '-rw-r--r-- 1 itv002461 students 14046 Apr 26 07:10 06_read_text_file_into_string.ipynb',
 '-rw-r--r-- 1 itv002461 students  7371 Apr 26 07:18 07_write_string_to_text_file.ipynb',
 '-rw-r--r-- 1 itv002461 students 10362 Apr 26 07:22 08_overview_of_modes_to_write_into_files.ipynb',
 '-rw-r--r-- 1 itv002461 students  4236 Apr 26 07:23 09_overview_of_delimited_strings.ipynb',
 '-rw-r--r-- 1 itv002461 students 13130 Apr 26 07:33 10_read_csv_into_list_of_strings.ipynb',
 'drwxr-xr-x 2 itv002461 students  4096 Apr 26 07:39 data',
 '-rw-r--r-- 1 itv002

In [9]:
!pwd

/home/itv002461/data-engineering-spark/itversity-material/01-python-and-sql/20_basics_of_file_io_using_python
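
`!pwd` is a notebook shortcut for running a shell command. The present working directory is also available from Python without a shell; a minimal sketch using the standard `os` and `pathlib` modules:

```python
import os
from pathlib import Path

# Present working directory as a string
cwd = os.getcwd()
print(cwd)

# The same location as a Path object
cwd_path = Path.cwd()
print(cwd_path)

# Relative paths such as 'data' are resolved against this directory.
resolved = os.path.abspath('data')
print(resolved)
```

The relative path `data` is used only as an example here; it does not need to exist for `os.path.abspath` to resolve it.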


In [10]:
subprocess.check_output('ls -l', shell=True).decode('utf-8').splitlines()

['total 116',
 '-rw-r--r-- 1 itv002461 students  1307 Apr 26 05:40 01_basics_of_file_io_using_python.ipynb',
 '-rw-r--r-- 1 itv002461 students  1686 Apr 26 05:41 02_overview_of_file_io.ipynb',
 '-rw-r--r-- 1 itv002461 students  9591 Apr 26 06:43 03_folders_and_files.ipynb',
 '-rw-r--r-- 1 itv002461 students  3939 Apr 26 07:04 04_file_paths_and_names.ipynb',
 '-rw-r--r-- 1 itv002461 students  5854 Apr 26 07:08 05_overview_of_retail_data.ipynb',
 '-rw-r--r-- 1 itv002461 students 14046 Apr 26 07:10 06_read_text_file_into_string.ipynb',
 '-rw-r--r-- 1 itv002461 students  7371 Apr 26 07:18 07_write_string_to_text_file.ipynb',
 '-rw-r--r-- 1 itv002461 students 10362 Apr 26 07:22 08_overview_of_modes_to_write_into_files.ipynb',
 '-rw-r--r-- 1 itv002461 students  4236 Apr 26 07:23 09_overview_of_delimited_strings.ipynb',
 '-rw-r--r-- 1 itv002461 students 13130 Apr 26 07:33 10_read_csv_into_list_of_strings.ipynb',
 '-rw-r--r-- 1 itv002461 students 10216 Apr 26 07:42 11_write_strings_to_file_in_

In [11]:
subprocess.check_output('ls -ltr ~', shell=True).decode('utf-8').splitlines()

['total 4',
 'drwxr-xr-x 9 itv002461 students 4096 Apr  6 01:18 data-engineering-spark']
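
In the command above, `~` is expanded by the shell, which is why `shell=True` matters there. When a path is built in Python, the standard library can perform the expansion instead. A minimal sketch:

```python
import os
from pathlib import Path

# Expand ~ to the home directory of the current user
home = os.path.expanduser('~')
print(home)

# pathlib equivalent
print(Path.home())

# ~ can also prefix a longer path; the folder name here is hypothetical
# and does not have to exist for the expansion to work.
expanded = os.path.expanduser('~/some_folder')
print(expanded)
```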

In [12]:
subprocess.check_output('ls -ltr data', shell=True).decode('utf-8').splitlines()

['total 20',
 '-rw-r--r-- 1 itv002461 students  41 Apr 26 07:18 sample_data.txt',
 '-rw-r--r-- 1 itv002461 students  27 Apr 26 07:20 overwrite.txt',
 '-rw-r--r-- 1 itv002461 students  54 Apr 26 07:20 append.txt',
 '-rw-r--r-- 1 itv002461 students  27 Apr 26 07:21 new_file.txt',
 '-rw-r--r-- 1 itv002461 students 119 Apr 26 07:40 departments.txt']
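
The same kind of listing can be produced without starting a separate process. The following sketch uses `os.scandir` and sorts entries by modification time, oldest first, which approximates the ordering of `ls -ltr`. It runs against a temporary directory with hypothetical file names so that it does not depend on the `data` folder shown above.

```python
import os
import tempfile

# Temporary directory with two files whose modification times we control.
data_dir = tempfile.mkdtemp()
for name, mtime in [('new_file.txt', 2000), ('old_file.txt', 1000)]:
    path = os.path.join(data_dir, name)
    with open(path, 'w') as f:
        f.write('sample')
    os.utime(path, (mtime, mtime))  # set (access time, modification time)

# Sort by modification time, oldest first (similar to ls -ltr).
with os.scandir(data_dir) as entries:
    listing = sorted(entries, key=lambda entry: entry.stat().st_mtime)

names = [entry.name for entry in listing]
for entry in listing:
    kind = 'folder' if entry.is_dir() else 'file'
    print(kind, entry.stat().st_size, entry.name)
```

Working with the entries as objects avoids having to decode and split text, at the cost of formatting the listing yourself.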