#OS Module in Python
- The `os` module is part of Python's standard library and provides a portable way to interact with the operating system.
- It covers creating and managing files and directories, reading environment variables, and managing processes, among other tasks.

###Python-OS-Module Functions
The *os* and *os.path* modules include many functions to interact with the file system.


Here we will discuss some important functions of the Python os module:

- Handling the Current Working Directory
- Creating a Directory
- Listing out Files and Directories with Python
- Deleting Directory or Files using Python

####Handling the Current Working Directory
- The current working directory is the directory in which a Python script is running. 
- When a file is referred to by name only (a relative path), Python resolves that name against the CWD, so the reference succeeds only if the file is in the CWD.

In [0]:
# To get the location of the current working directory os.getcwd() is used.
import os

cwd = os.getcwd()
print(cwd)

/databricks/driver
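
The short sketch below illustrates how a name-only reference is resolved against the CWD. The file name `data.csv` is hypothetical and does not need to exist, since only the path arithmetic is shown.

```python
import os

# Hypothetical file name, used only for illustration; it need not exist.
name_only = "data.csv"

# abspath() resolves a relative name against the current working directory.
resolved = os.path.abspath(name_only)

print(resolved)
# The resolved path is simply the CWD joined with the bare name.
print(resolved == os.path.join(os.getcwd(), name_only))
```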


**Changing the Current working directory**
- Use `os.chdir(path)` to change the current working directory to the specified path.
- It takes a single argument, the path of the new working directory, and raises FileNotFoundError if that path does not exist.

In [0]:
import os

print("Before changing the current working directory:", os.getcwd())

os.chdir("/databricks/driver/rohish/")
print("after changing the current working directory:", os.getcwd())

Before changing the current working directory: /databricks/driver
after changing the current working directory: /databricks/driver/rohish
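
Because `os.chdir()` affects the whole process, it is often convenient to switch back afterwards. The following is a minimal sketch of one way to do that. The helper name `working_directory` is an assumption made for this example, and a temporary directory stands in for a real target path.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Hypothetical helper: switch to `path`, then always switch back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the CWD is always restored.
        os.chdir(previous)


start = os.getcwd()
with tempfile.TemporaryDirectory() as scratch:
    with working_directory(scratch):
        inside = os.getcwd()   # CWD is the scratch directory here
    after = os.getcwd()        # CWD is back to the starting directory

print("inside differs from start:", inside != start)
print("restored:", after == start)
```

Python 3.11 and later also ship `contextlib.chdir`, which serves the same purpose.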


####Creating a Directory
There are different methods available in the OS module for creating a directory. These are –
- os.mkdir()
- os.makedirs()

#####Creating a Single Directory:

Use `os.mkdir(path)` to create a single directory. If the directory already exists, it will raise a FileExistsError.


In [0]:
import os
os.mkdir("/databricks/driver/rohish/new_dir/")

# checking the directory
dbutils.fs.ls("file:/databricks/driver/rohish/") # dbutils is a Databricks utility and will not work in plain Python

Out[11]: [FileInfo(path='file:/databricks/driver/rohish/new_dir/', name='new_dir/', size=4096, modificationTime=1722319559597)]
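
As a sketch of the error behaviour of `os.mkdir()`, the example below uses a temporary directory (an assumption, so that it runs anywhere) and shows the two most common failures: the directory already exists, or its parent is missing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "new_dir")
    os.mkdir(target)  # first call succeeds

    try:
        os.mkdir(target)  # second call: the directory already exists
        second_result = "created"
    except FileExistsError:
        second_result = "already exists"

    try:
        # mkdir() creates only one level, so a missing parent is an error.
        os.mkdir(os.path.join(scratch, "missing_parent", "child"))
        nested_result = "created"
    except FileNotFoundError:
        nested_result = "parent missing"

print(second_result)
print(nested_result)
```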

#####Creating Multiple Directories:

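Use `os.makedirs(path)` to create a directory along with any missing intermediate directories. Intermediate directories that already exist are reused; a FileExistsError is raised only if the final (leaf) directory already exists, unless `exist_ok=True` is passed.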

In [0]:
import os
os.makedirs("/databricks/driver/rohish/parent_directory/child_directory")

# checking the directories
dbutils.fs.ls("file:/databricks/driver/rohish/parent_directory") # dbutils is a Databricks utility and will not work in plain Python

Out[13]: [FileInfo(path='file:/databricks/driver/rohish/parent_directory/child_directory/', name='child_directory/', size=4096, modificationTime=1722319705070)]
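
The sketch below, run inside a temporary directory (an assumption for portability), shows when `os.makedirs()` raises and how the `exist_ok` parameter makes repeated calls safe.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    nested = os.path.join(scratch, "parent_directory", "child_directory")
    os.makedirs(nested)  # creates both levels

    # An existing intermediate directory is fine; only the leaf is new here.
    sibling = os.path.join(scratch, "parent_directory", "second_child")
    os.makedirs(sibling)

    try:
        os.makedirs(nested)  # the leaf itself already exists
        repeat_result = "created"
    except FileExistsError:
        repeat_result = "already exists"

    # exist_ok=True turns the repeated call into a no-op.
    os.makedirs(nested, exist_ok=True)

    created = sorted(os.listdir(os.path.join(scratch, "parent_directory")))

print(repeat_result)
print(created)
```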

####Listing Out Files and Directories with Python

- The `os.listdir()` method in Python is used to get the list of all files and directories in the specified directory. 
- If we don’t specify any directory, then the list of files and directories in the current working directory will be returned.

In [0]:
import os

print(os.getcwd())
dir_list = os.listdir()
print(dir_list)

/databricks/driver
['preload_class.lst', 'hadoop_accessed_config.lst', 'conf', 'azure', 'logs', 'eventlogs']


In [0]:
#  This code lists all the files and directories in the root directory (“/”).

path = "/"
dir_list = os.listdir(path) 
print("Files and directories in '", path, "' :") 
print(dir_list) 

Files and directories in ' / ' :
['mnt', 'BUILD', 'libx32', 'run', 'root', 'tmp', 'databricks', 'bin', 'boot', 'media', 'home', 'lib', 'sys', 'etc', 'sbin', 'proc', 'dev', 'var', 'usr', 'lib32', 'srv', 'lib64', 'opt', 'dbfs', 'Volumes', 'local_disk0', 'Workspace']
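
`os.listdir()` returns bare names and does not say which entries are files and which are directories. A minimal sketch of separating the two is shown below. The directory layout (`logs`, `report.txt`) is made up for the example and created in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Hypothetical contents: one sub-directory and one file.
    os.mkdir(os.path.join(scratch, "logs"))
    with open(os.path.join(scratch, "report.txt"), "w") as fh:
        fh.write("demo")

    files, dirs = [], []
    for name in sorted(os.listdir(scratch)):
        # listdir() yields names only, so rebuild the full path before testing.
        full_path = os.path.join(scratch, name)
        if os.path.isdir(full_path):
            dirs.append(name)
        else:
            files.append(name)

print("files:", files)
print("dirs:", dirs)
```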


####Deleting Directory or Files using Python
OS module provides different methods for removing directories and files in Python. These are – 

- Using `os.remove()`
- Using `os.rmdir()`

#####Using os.remove() Method
- `os.remove(path)` deletes a single file.
- It cannot delete a directory; use `os.rmdir()` for that.
- If the specified path is a directory, an OSError is raised.

In [0]:
# checking the directories and files (Databricks-specific code; it will not work in plain Python)
display(dbutils.fs.ls("file:/databricks/driver/rohish/"))

path,name,size,modificationTime
file:/databricks/driver/rohish/file2.txt,file2.txt,13,1722407818900
file:/databricks/driver/rohish/file1.txt,file1.txt,13,1722407801188


In [0]:
# This code removes a file named “file1.txt” from the specified location
import os

location = "/databricks/driver/rohish/"
file_name = "file1.txt"

os.remove(location+file_name)

In [0]:
# file “file1.txt” removed from the location
display(dbutils.fs.ls("file:/databricks/driver/rohish/"))

path,name,size,modificationTime
file:/databricks/driver/rohish/file2.txt,file2.txt,13,1722407818900
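
The following sketch shows one way to handle the errors `os.remove()` can raise. The helper name `remove_file` is an assumption for this example, and a temporary directory replaces the Databricks path so the code runs anywhere.

```python
import os
import tempfile


def remove_file(path):
    """Hypothetical helper: delete a file and report what happened."""
    try:
        os.remove(path)
        return "removed"
    except FileNotFoundError:
        return "not found"
    except OSError:
        # Directories end up here; the exact OSError subclass varies by platform.
        return "not a removable file"


with tempfile.TemporaryDirectory() as scratch:
    file_path = os.path.join(scratch, "file1.txt")
    with open(file_path, "w") as fh:
        fh.write("demo")

    first = remove_file(file_path)   # the file exists
    second = remove_file(file_path)  # the file is already gone
    on_dir = remove_file(scratch)    # a directory, not a file

print(first, "|", second, "|", on_dir)
```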


#####Using os.rmdir()
- `os.rmdir()` method in Python is used to remove or delete an empty directory. 
- OSError will be raised if the specified path is not an empty directory.

In [0]:
# listing directories before deleting
display(dbutils.fs.ls("file:/databricks/driver/rohish/"))

path,name,size,modificationTime
file:/databricks/driver/rohish/file2.txt,file2.txt,13,1722407818900
file:/databricks/driver/rohish/dir1/,dir1/,4096,1722408329829
file:/databricks/driver/rohish/dir2/,dir2/,4096,1722408334361


In [0]:
# This code removes the empty directory "dir2" from the specified location
import os

location = "/databricks/driver/rohish/"
dir_name = "dir2"

os.rmdir(location+dir_name)

In [0]:
# directory dir2 is removed from the location
display(dbutils.fs.ls("file:/databricks/driver/rohish/"))

path,name,size,modificationTime
file:/databricks/driver/rohish/file2.txt,file2.txt,13,1722407818900
file:/databricks/driver/rohish/dir1/,dir1/,4096,1722408329829
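
As a sketch of the "empty directory only" rule, the example below tries `os.rmdir()` on a directory that still contains a file, then empties it and tries again. The names are illustrative and live in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "dir1")
    os.mkdir(target)
    inner = os.path.join(target, "file2.txt")
    with open(inner, "w") as fh:
        fh.write("demo")

    try:
        os.rmdir(target)  # fails: the directory still contains a file
        first_attempt = "removed"
    except OSError:
        first_attempt = "not empty"

    os.remove(inner)      # empty the directory first
    os.rmdir(target)      # now the removal succeeds
    still_there = os.path.exists(target)

print(first_attempt)
print("still there:", still_there)
```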


### os.path
- The os.path module in Python is a handy tool for manipulating file and directory paths. 
- It provides several functions that can be utilized to perform common tasks related to file paths.

#####Joining Paths with os.path.join

The function `os.path.join()` is used to combine one or more path names into a single path.

In [0]:
import os

path = os.path.join('/rohish', 'user', 'documents')
print(path)

/rohish/user/documents
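
Two behaviours of `os.path.join()` are easy to miss, and the sketch below illustrates them. The printed values assume a POSIX system such as the Linux driver used in this notebook; on Windows the separator differs.

```python
import os

# Plain case: components are joined with the platform separator.
plain = os.path.join("/rohish", "user", "documents")

# An absolute component discards everything to its left.
reset = os.path.join("/rohish/user", "/etc", "hosts")

# An empty last component leaves a trailing separator.
trailing = os.path.join("rohish", "")

print(plain)
print(reset)
print(trailing)
```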


#####Getting the Base Name with os.path.basename
- The `os.path.basename()` function returns the base name of the pathname path. 
- This is the second element of the pair returned by passing path to the function `os.path.split()`.

In [0]:
# In this example, os.path.basename() returns the base name rohish from the path string.
import os

path = '/databricks/driver/rohish'
basepath = os.path.basename(path)
print(basepath)

rohish
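
One edge case is worth a short sketch: when the path ends with a slash, the tail is empty, so `os.path.basename()` returns an empty string. Normalizing the path first with `os.path.normpath()` is one possible workaround.

```python
import os

with_slash = "/databricks/driver/rohish/"

# The tail after the final slash is empty, so the base name is empty too.
raw_name = os.path.basename(with_slash)

# normpath() drops the trailing slash, which restores the expected name.
clean_name = os.path.basename(os.path.normpath(with_slash))

print(repr(raw_name))
print(repr(clean_name))
```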


#####Splitting and Joining Paths
- The os.path module provides functions to split a pathname into a pair (head, tail) and to join two or more pathname components.
- `os.path.split()` splits the path into two parts: the head (everything before the last slash, e.g. /databricks/driver/rohish) and the tail (the last component, e.g. file2.txt). This can be useful when you need to manipulate individual parts of a path.

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"
head, tail = os.path.split(path)

print(head)
print(tail)

print(os.path.split("/databricks/driver/rohish/file2.txt")[0])

/databricks/driver/rohish
file2.txt
/databricks/driver/rohish
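
The sketch below shows that splitting and joining are inverse operations for an ordinary path, and what `os.path.split()` returns in two edge cases. The results in the comments assume a POSIX system.

```python
import os

path = "/databricks/driver/rohish/file2.txt"
head, tail = os.path.split(path)

# Joining the two halves gives back the original path.
rebuilt = os.path.join(head, tail)

# A trailing slash produces an empty tail.
trailing = os.path.split("/databricks/driver/rohish/")

# A bare file name produces an empty head.
bare = os.path.split("file2.txt")

print(rebuilt == path)
print(trailing)
print(bare)
```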


#####os.path.exists(path): 
Returns True if the specified path exists, False otherwise.

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"

if os.path.exists(path):
    print("file2.txt exists")
else:
    print("file2.txt does not exist")

file2.txt exists


#####os.path.isfile(path):
Returns True if the specified path is an existing regular file, False otherwise.

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"
path1 = "/databricks/driver/rohish/zade/"

# os.path.isfile() is True only for an existing regular file
if os.path.isfile(path):
    print("file2.txt is a file")
else:
    print("file2.txt is not a file")

# a directory path (or a missing path) is not a regular file
if os.path.isfile(path1):
    print("zade is a file")
else:
    print("zade is not a file")

file2.txt is a file
zade is not a file


#####os.path.isdir(path): 
Returns True if the specified path is an existing directory, False otherwise.

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"

# file2.txt is a regular file, so os.path.isdir() returns False
if os.path.isdir(path):
    print("path is a directory")
else:
    print("path is not a directory")

path is not a directory


#####os.path.dirname(path):
Returns the directory name of the pathname path. 

This is the first half of the pair returned by os.path.split(path).

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"
print(os.path.dirname(path))

/databricks/driver/rohish


#####os.path.splitext(path):
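Splits the pathname path into a pair (root, ext), where ext is the extension (starting at the last dot of the final path component and including that dot) and root is everything before it. If there is no extension, ext is an empty string.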

In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"

root, ext = os.path.splitext(path)

print(root)
print(ext)


/databricks/driver/rohish/file2
.txt
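
A few edge cases of `os.path.splitext()` are sketched below with made-up file names: only the last extension is split off, and a leading dot in the file name is not treated as an extension.

```python
import os

# Only the final extension is separated.
double_ext = os.path.splitext("archive.tar.gz")

# A leading dot marks a hidden file, not an extension.
dotfile = os.path.splitext(".bashrc")

# No dot at all gives an empty extension.
no_ext = os.path.splitext("README")

print(double_ext)
print(dotfile)
print(no_ext)
```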


#####os.path.abspath(path):
Returns a normalized absolute version of the pathname path.

In [0]:
import os

file = "file4.txt"

print(os.path.abspath(file))

/databricks/driver/file4.txt
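
"Normalized" means that redundant parts such as `..` are collapsed. The sketch below illustrates this with a hypothetical relative path. Note that `os.path.abspath()` works purely on the string and does not check whether the file exists.

```python
import os

# Hypothetical relative path containing a ".." step; nothing needs to exist.
relative = os.path.join("rohish", "..", "file4.txt")

absolute = os.path.abspath(relative)

# abspath() behaves like normpath() applied to the CWD joined with the path.
expected = os.path.normpath(os.path.join(os.getcwd(), relative))

print(absolute)
print(absolute == expected)
```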


#####os.path.getsize(path): 
Returns the size, in bytes, of the specified path.



In [0]:
import os

path = "/databricks/driver/rohish/file2.txt"

print(os.path.getsize(path))

13
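
As a small sketch, the example below writes a known number of bytes to a temporary file (the content is made up), checks the reported size, and shows that `os.path.getsize()` raises OSError when the path no longer exists.

```python
import os
import tempfile

payload = b"hello, world!"  # hypothetical content, 13 bytes long

with tempfile.NamedTemporaryFile(delete=False) as fh:
    fh.write(payload)
    demo_path = fh.name

size = os.path.getsize(demo_path)
print(size, "bytes")

os.remove(demo_path)

try:
    os.path.getsize(demo_path)  # the file is gone now
    missing_result = "ok"
except OSError:
    missing_result = "OSError"

print(missing_result)
```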


#####os.path.getmtime(path): 
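Returns the time of last modification of the path as a floating-point number of seconds since the epoch; `time.ctime()` converts it to a readable string.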

In [0]:
import os
import time

path = "/databricks/driver/rohish/file2.txt"

print(os.path.getmtime(path))
print(time.ctime(os.path.getmtime(path)))

1722861034.142767
Mon Aug  5 12:30:34 2024
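
When a custom date format is needed, the timestamp can also be converted with `datetime`. The sketch below uses a temporary file as a stand-in for a real path, and the format string is only an example.

```python
import os
import tempfile
from datetime import datetime

with tempfile.NamedTemporaryFile(delete=False) as fh:
    fh.write(b"demo")
    demo_path = fh.name

mtime = os.path.getmtime(demo_path)     # float, seconds since the epoch
modified = datetime.fromtimestamp(mtime)  # local-time datetime object

print(type(mtime).__name__)
print(modified.strftime("%Y-%m-%d %H:%M:%S"))

os.remove(demo_path)
```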


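- **os.path.getatime(path):** Returns the time of last access of the path, in seconds since the epoch.
- **os.path.getctime(path):** Returns the system's ctime, in seconds since the epoch. On Unix this is the time of the last metadata change; on Windows it is the creation time.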
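
The three timestamps can diverge. In the sketch below, `os.utime()` backdates the access and modification times of a temporary file by one day (an arbitrary choice for illustration), while the ctime keeps a recent value.

```python
import os
import tempfile
import time

with tempfile.NamedTemporaryFile(delete=False) as fh:
    demo_path = fh.name

# Backdate access and modification times by one day (arbitrary offset).
one_day_ago = time.time() - 86400
os.utime(demo_path, (one_day_ago, one_day_ago))

atime = os.path.getatime(demo_path)
mtime = os.path.getmtime(demo_path)
ctime = os.path.getctime(demo_path)

print("atime:", time.ctime(atime))
print("mtime:", time.ctime(mtime))
print("ctime:", time.ctime(ctime))  # not backdated by os.utime()

os.remove(demo_path)
```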

####Example Usage
Example that demonstrates some of the above functions:

In [0]:
import os
import time  # needed for time.ctime() below

# Joining paths
path = os.path.join('/databricks', 'driver', 'rohish', 'file2.txt') 
print("Joined path:", path)

# Checking if a path exists
if os.path.exists(path):
    print(f"{path} exists")

# Checking if it's a file or directory
if os.path.isfile(path):
    print(f"{path} is a file")
elif os.path.isdir(path):
    print(f"{path} is a directory")

# Getting the basename and dirname
basename = os.path.basename(path)
dirname = os.path.dirname(path)
print("Base name:", basename)
print("Directory name:", dirname)

# Splitting the path
head, tail = os.path.split(path)
print("Head:", head)
print("Tail:", tail)

# Getting the absolute path
abs_path = os.path.abspath(path)
print("Absolute path:", abs_path)

# Getting file size
size = os.path.getsize(path)
print("File size:", size, "bytes")

# Getting file modification time
mtime = os.path.getmtime(path)
print("Last modification time:", time.ctime(mtime))

# Getting file access time
atime = os.path.getatime(path)
print("Last access time:", time.ctime(atime))

# Getting the ctime (creation time on Windows; last metadata change on Unix)
ctime = os.path.getctime(path)
print("Creation time:", time.ctime(ctime))


Joined path: /databricks/driver/rohish/file2.txt
/databricks/driver/rohish/file2.txt exists
/databricks/driver/rohish/file2.txt is a file
Base name: file2.txt
Directory name: /databricks/driver/rohish
Head: /databricks/driver/rohish
Tail: file2.txt
Absolute path: /databricks/driver/rohish/file2.txt
File size: 13 bytes
Last modification time: Mon Aug  5 12:30:34 2024
Last access time: Mon Aug  5 12:30:34 2024
Creation time: Mon Aug  5 12:30:34 2024
