# os.path - Platform-independent Manipulation of Filenames

Purpose: Parse, build, test, and otherwise work on filenames and paths.

Writing code to work with files on multiple platforms is easy using the functions included in the os.path module. Even programs not intended to be ported between platforms should use os.path for reliable filename parsing.

## Parsing Paths

The first set of functions in os.path can be used to parse strings representing filenames into their component parts. It is important to realize that these functions do not depend on the paths actually existing; they operate solely on the strings.

Path parsing depends on a few variables defined in os:

* os.sep - The separator between portions of the path (e.g., “/” or “\”).
* os.extsep - The separator between a filename and the file “extension” (e.g., “.”).
* os.pardir - The path component that means traverse the directory tree up one level (e.g., “..”).
* os.curdir - The path component that refers to the current directory (e.g., “.”).
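
As a quick check, the values of these variables on the current platform can be printed directly. This is only a minimal sketch; the output is omitted because os.sep differs between platforms.

```python
import os

# Show the path-related constants for the platform running the code.
for name in ('sep', 'extsep', 'pardir', 'curdir'):
    print('os.{:<7}: {!r}'.format(name, getattr(os, name)))
```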

### split()
The split() function breaks the path into two separate parts and returns a tuple with the results. The second element of the tuple is the last component of the path, and the first element is everything that comes before it.

In [5]:
# ospath_split.py
import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {}'.format(path, os.path.split(path)))

 '/one/two/three' : ('/one/two', 'three')
'/one/two/three/' : ('/one/two/three', '')
              '/' : ('/', '')
              '.' : ('', '.')
               '' : ('', '')


### basename()
The basename() function returns a value equivalent to the second part of the split() value.

In [6]:
# ospath_basename.py
import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {!r}'.format(path, os.path.basename(path)))

 '/one/two/three' : 'three'
'/one/two/three/' : ''
              '/' : ''
              '.' : '.'
               '' : ''


The full path is stripped down to the last element, whether that refers to a file or directory. If the path ends in the directory separator (os.sep), the base portion is considered to be empty.
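
When the last named component is wanted even if the path ends with a separator, one possible approach is to normalize the path before calling basename(). The helper name last_component() below is hypothetical, not part of os.path, and the sketch assumes that collapsing "." and ".." components with normpath() is acceptable for the paths involved.

```python
import os.path


def last_component(path):
    """Return the final path component, ignoring a trailing separator."""
    # normpath() removes the trailing separator (except for the root).
    return os.path.basename(os.path.normpath(path))


for path in ['/one/two/three', '/one/two/three/', '/']:
    print('{!r:>17} : {!r}'.format(path, last_component(path)))
```

The root directory still produces an empty string, because it has no named component to return.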

### dirname()
The dirname() function returns the first part of the split path:

In [7]:
# ospath_dirname.py
import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {!r}'.format(path, os.path.dirname(path)))

 '/one/two/three' : '/one/two'
'/one/two/three/' : '/one/two/three'
              '/' : '/'
              '.' : ''
               '' : ''
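
Because dirname() works only on the string, it can be applied repeatedly to walk up a path one level at a time. The generator below is a sketch under the assumption of POSIX-style paths; the name ancestors() is made up for this example.

```python
import os.path


def ancestors(path):
    """Yield each parent of path, from the nearest up to the root."""
    current = path
    while True:
        parent = os.path.dirname(current)
        # Stop when dirname() no longer changes the value (the root)
        # or returns an empty string (a relative path is exhausted).
        if not parent or parent == current:
            break
        yield parent
        current = parent


for parent in ancestors('/one/two/three'):
    print(parent)
```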


### splitext()
splitext() works like split(), but divides the path on the extension separator, rather than the directory separator.

In [11]:
# ospath_splitext.py
import os.path

PATHS = [
    'filename.txt',
    'filename',
    '/path/to/filename.txt',
    '/',
    '',
    'my-archive.tar.gz',
    'no-extension.',
]

for path in PATHS:
    print('{!r:>21} : {!r}'.format(path, os.path.splitext(path)))

       'filename.txt' : ('filename', '.txt')
           'filename' : ('filename', '')
'/path/to/filename.txt' : ('/path/to/filename', '.txt')
                  '/' : ('/', '')
                   '' : ('', '')
  'my-archive.tar.gz' : ('my-archive.tar', '.gz')
      'no-extension.' : ('no-extension', '.')


Only the last occurrence of os.extsep is used when looking for the extension, so if a filename has multiple extensions, splitting it leaves part of the extension on the prefix.
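
If the whole compound extension is needed, splitext() can be called in a loop until no extension remains. This is a sketch rather than a general solution: the helper split_all_extensions() is hypothetical, and it treats every dot-separated suffix as an extension, so a name such as version-1.2.txt would be split after "version-1".

```python
import os.path


def split_all_extensions(path):
    """Split path into (root, combined_extensions)."""
    root, ext = os.path.splitext(path)
    suffixes = []
    while ext:
        # Collect suffixes right to left, keeping their original order.
        suffixes.insert(0, ext)
        root, ext = os.path.splitext(root)
    return root, ''.join(suffixes)


for path in ['my-archive.tar.gz', 'filename.txt', 'filename']:
    print('{!r:>21} : {!r}'.format(path, split_all_extensions(path)))
```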

### commonprefix()
commonprefix() takes a list of paths as an argument and returns a single string that represents a common prefix present in all of the paths. The value may represent a path that does not actually exist, and because the comparison is made character by character without regard to the path separator, the prefix might not stop on a separator boundary.

In [12]:
# ospath_commonprefix.py
import os.path

paths = ['/one/two/three/four',
         '/one/two/threefold',
         '/one/two/three/',
         ]
for path in paths:
    print('PATH:', path)

print()
print('PREFIX:', os.path.commonprefix(paths))

PATH: /one/two/three/four
PATH: /one/two/threefold
PATH: /one/two/three/

PREFIX: /one/two/three


### commonpath()
commonpath() does honor path separators, and returns a prefix that does not include partial path values.

In [13]:
# ospath_commonpath.py
import os.path

paths = ['/one/two/three/four',
         '/one/two/threefold',
         '/one/two/three/',
         ]
for path in paths:
    print('PATH:', path)

print()
print('PREFIX:', os.path.commonpath(paths))

PATH: /one/two/three/four
PATH: /one/two/threefold
PATH: /one/two/three/

PREFIX: /one/two
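
One place where the difference matters is testing whether a path lies inside a given directory. The sketch below uses commonpath() for that test; the function name is_within() is hypothetical, POSIX-style paths are assumed, and symbolic links are not resolved.

```python
import os.path


def is_within(base, target):
    """Return True if target is base itself or lies underneath it."""
    # abspath() makes both values absolute and normalized, so they can
    # be compared directly with the result of commonpath().
    base = os.path.abspath(base)
    target = os.path.abspath(target)
    return os.path.commonpath([base, target]) == base


print(is_within('/one/two/three', '/one/two/three/four'))
print(is_within('/one/two/three', '/one/two/threefold'))
```

A check based on commonprefix() or str.startswith() would accept the second path as well, because "/one/two/threefold" begins with the characters "/one/two/three".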


## Building Paths

Besides taking existing paths apart, it is frequently necessary to build paths from other strings. 

### join()
To combine several path components into a single value, use join():    

In [16]:
# ospath_join.py
import os.path

PATHS = [
    ('one', 'two', 'three'),
    ('/', 'one', 'two', 'three'),
    ('/one', '/two', '/three'),
]

for parts in PATHS:
    print('{} : {!r}'.format(parts, os.path.join(*parts)))

('one', 'two', 'three') : 'one/two/three'
('/', 'one', 'two', 'three') : '/one/two/three'
('/one', '/two', '/three') : '/three'


If any argument to join begins with os.sep, all of the previous arguments are discarded and the new one becomes the beginning of the return value.
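
This behavior is worth keeping in mind when one of the components comes from outside the program. The following sketch, which assumes POSIX-style paths, shows the effect and one simple way to avoid it by stripping leading separators. The helper join_relative() and the directory /srv/data are made up for the example, and the helper does nothing about ".." components.

```python
import os
import os.path


def join_relative(base, name):
    """Join name under base even if name starts with a separator."""
    return os.path.join(base, name.lstrip(os.sep))


# An absolute second argument replaces everything before it.
print(os.path.join('/srv/data', '/etc/passwd'))

# Stripping the leading separator keeps the result under the base.
print(join_relative('/srv/data', '/etc/passwd'))
```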

### expanduser() 
It is also possible to work with paths that include “variable” components that can be expanded automatically. For example, expanduser() converts the tilde (~) character to the name of a user’s home directory.

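If the user’s home directory cannot be found, the string is returned unchanged, as with ~nosuchuser in this example. The same happens with ~dhellmann when no account with that name exists on the machine running the code.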

In [20]:
# ospath_expanduser.py
import os.path

for user in ['', 'dhellmann', 'nosuchuser']:
    lookup = '~' + user
    print('{!r:>15} : {!r}'.format(
        lookup, os.path.expanduser(lookup)))

            '~' : '/Users/binyang'
   '~dhellmann' : '~dhellmann'
  '~nosuchuser' : '~nosuchuser'
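
On POSIX systems a bare tilde is expanded from the HOME environment variable when it is set, and only a tilde at the very start of the string is treated specially. The sketch below illustrates both points under that POSIX assumption; the directory /tmp/example-home is hypothetical and does not need to exist.

```python
import os
import os.path

saved_home = os.environ.get('HOME')
os.environ['HOME'] = '/tmp/example-home'  # hypothetical directory
try:
    expanded = os.path.expanduser('~/notes.txt')
    # A tilde that is not the first character is left alone.
    untouched = os.path.expanduser('/data/~/notes.txt')
finally:
    # Restore the original environment.
    if saved_home is None:
        del os.environ['HOME']
    else:
        os.environ['HOME'] = saved_home

print(expanded)
print(untouched)
```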


### expandvars()
expandvars() is more general, and expands any shell environment variables present in the path.

In [26]:
# ospath_expandvars.py
import os.path
import os

os.environ['MYVAR'] = 'VALUE'

print(os.path.expandvars('/path/to/$MYVAR'))

/path/to/VALUE


In [24]:
! env | grep MYVAR

MYVAR=VALUE
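
Two further details can be useful in practice: the ${NAME} form marks where a variable name ends, and references to undefined variables are left in place rather than replaced with an empty string. A short sketch follows; the variable name NO_SUCH_VARIABLE is made up and is assumed to be unset.

```python
import os
import os.path

os.environ['MYVAR'] = 'VALUE'
os.environ.pop('NO_SUCH_VARIABLE', None)  # make sure it is undefined

# Braces separate the variable name from the text that follows it.
braces = os.path.expandvars('/path/to/${MYVAR}_backup')

# An undefined variable is not expanded.
missing = os.path.expandvars('/path/to/$NO_SUCH_VARIABLE')

print(braces)
print(missing)
```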


## Normalizing Paths

### normpath()
Paths assembled from separate strings using join() or with embedded variables might end up with extra separators or relative path components. Use normpath() to clean them up:

In [33]:
# ospath_normpath.py
import os.path

PATHS = [
    'one//two//three',
    'one/./two/./three',
    'one/../alt/two/three',
]

for path in PATHS:
    print('{!r:>30} : {!r}'.format(path, os.path.normpath(path)))

             'one//two//three' : 'one/two/three'
           'one/./two/./three' : 'one/two/three'
        'one/../alt/two/three' : 'alt/two/three'
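
Like the parsing functions, normpath() works only on the string and never consults the file system, so collapsing ".." can change the meaning of a path that passes through a symbolic link. The sketch below shows a few edge cases, assuming POSIX-style paths.

```python
import os.path

PATHS = [
    '../one/./two',      # a leading '..' cannot be collapsed
    'one/two/../../..',  # more '..' components than names
    '/../one',           # '..' directly under the root is dropped
    '',                  # the empty string becomes the current directory
]

for path in PATHS:
    print('{!r:>20} : {!r}'.format(path, os.path.normpath(path)))
```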


### abspath()
To convert a relative path to an absolute filename, use abspath().

In [34]:
# ospath_abspath.py
import os
import os.path

os.chdir('/usr')

PATHS = [
    '.',
    '..',
    './one/two/three',
    '../one/two/three',
]

for path in PATHS:
    print('{!r:>21} : {!r}'.format(path, os.path.abspath(path)))

                  '.' : '/usr'
                 '..' : '/'
    './one/two/three' : '/usr/one/two/three'
   '../one/two/three' : '/one/two/three'
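
On most platforms abspath() gives the same result as joining the path to the current working directory and normalizing it. The sketch below compares the two forms without changing directory, so the values printed depend on where it is run.

```python
import os
import os.path

PATHS = ['.', '..', './one/two/three', '../one/two/three']

for path in PATHS:
    # Build the absolute path by hand from the working directory.
    manual = os.path.normpath(os.path.join(os.getcwd(), path))
    print('{!r:>21} : {}'.format(path, manual == os.path.abspath(path)))
```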


## File Times

Besides working with paths, os.path includes functions for retrieving file properties, similar to the ones returned by os.stat():

In [44]:
# ospath_properties.py
import os.path
import time

try:
    print('File         :', __file__)
    print('Access time  :', time.ctime(os.path.getatime(__file__)))
    print('Modified time:', time.ctime(os.path.getmtime(__file__)))
    print('Change time  :', time.ctime(os.path.getctime(__file__)))
    print('Size         :', os.path.getsize(__file__))
except Exception as err:
    # __file__ is undefined in a notebook, so the NameError is reported here.
    print(type(err), err)

<class 'NameError'> name '__file__' is not defined


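os.path.getatime() returns the access time, os.path.getmtime() returns the modification time, and os.path.getctime() returns the system's ctime, which is the time of the last metadata change on Unix-like systems and the creation time on Windows. Each time is returned as a number of seconds since the epoch. os.path.getsize() returns the amount of data in the file, represented in bytes.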

Note: \__file__ is defined for modules and Python scripts, but not in notebooks, which is why the cell above reports a NameError.
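
To try these functions where \__file__ is not available, a temporary file can stand in for the script. This is a minimal sketch; the file contents are arbitrary and the file is removed afterwards.

```python
import os
import os.path
import tempfile
import time

# Create a small file to inspect, since __file__ may not be defined.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b'hello world')
    temp_path = handle.name

try:
    size = os.path.getsize(temp_path)
    mtime = os.path.getmtime(temp_path)
    info = os.stat(temp_path)
    print('Modified time:', time.ctime(mtime))
    print('Size         :', size)
    print('Same as stat :', mtime == info.st_mtime)
finally:
    os.remove(temp_path)
```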

## Testing Files

When a program encounters a path name, it often needs to know whether the path refers to a file, directory, or symlink and whether it exists. os.path includes functions for testing all of these conditions.

In [50]:
# ospath_tests.py
import os.path

try:
    FILENAMES = [
        # Disabled because __file__ is not defined in a notebook:
        # __file__,
        # os.path.dirname(__file__),
        '/',
        './broken_link',
    ]
    for file in FILENAMES:
        print('File        : {!r}'.format(file))
        print('Absolute    :', os.path.isabs(file))
        print('Is File?    :', os.path.isfile(file))
        print('Is Dir?     :', os.path.isdir(file))
        print('Is Link?    :', os.path.islink(file))
        print('Mountpoint? :', os.path.ismount(file))
        print('Exists?     :', os.path.exists(file))
        print('Link Exists?:', os.path.lexists(file))
        print()
except Exception as err:
    print(type(err), err)

File        : '/'
Absolute    : True
Is File?    : False
Is Dir?     : True
Is Link?    : False
Mountpoint? : True
Exists?     : True
Link Exists?: True

File        : './broken_link'
Absolute    : False
Is File?    : False
Is Dir?     : False
Is Link?    : False
Mountpoint? : False
Exists?     : False
Link Exists?: False
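
In the output above, every test on ./broken_link is False, including lexists(), which indicates that no link with that name was present when the cell ran. The sketch below creates a dangling symbolic link in a temporary directory to show how exists() and lexists() differ. It assumes a platform where os.symlink() is permitted, and the file names are made up for the example.

```python
import os
import os.path
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, 'missing_target')
    link = os.path.join(workdir, 'broken_link')
    os.symlink(target, link)  # the target is never created

    results = {
        'islink': os.path.islink(link),
        'isfile': os.path.isfile(link),
        'exists': os.path.exists(link),    # follows the link
        'lexists': os.path.lexists(link),  # tests the link itself
    }

for name, value in results.items():
    print('{:<8}: {}'.format(name, value))
```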

