# Python File Handling



---

# 1. Key concepts — what is a file in Python?

* A *file* is an OS resource that stores bytes. Python exposes file handling via high-level objects (file objects) that wrap OS I/O.
* Files operate in **text mode** (decode/encode bytes ↔ Unicode strings) or **binary mode** (pure bytes).
* File operations are I/O bound — they interact with disk, remote filesystems, or streams. They can be blocking and expensive.
* Important: always **close** a file when done (use `with` context manager to ensure this even on exceptions).
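The difference between text and binary mode can be seen by writing a string and reading the same file back both ways. This is a minimal sketch; the file name `greeting.txt` is hypothetical and the file lives in a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'greeting.txt'   # hypothetical file name

    # Text mode: a str goes in and is encoded to bytes on disk.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('caf\u00e9')

    # Binary mode: the raw bytes come back, with no decoding.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode again: the bytes are decoded back to a str.
    with open(path, 'r', encoding='utf-8') as f:
        text = f.read()

print(raw)                  # b'caf\xc3\xa9'
print(len(text), len(raw))  # 4 5  (4 characters, 5 bytes)
print(f.closed)             # True: the with block closed the file
```

The accented character occupies two bytes in UTF-8, which is why the character count and the byte count differ.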

---

# 2. Opening files — `open()` and modes

## Syntax

```python
f = open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
```

* `file` — path (string, bytes, or `pathlib.Path`).
* `mode` — string specifying how to open (see below).
* `encoding` — text encoding (e.g., `'utf-8'`). Optional, but pass it explicitly in text mode whenever portability matters, because the default depends on the platform and locale.
* `buffering` — controls buffering: `0` unbuffered (binary mode only), `1` line buffered (text mode only), `>1` buffer size in bytes, `-1` default policy.
* `errors` — how to handle encoding and decoding errors (`'strict'`, `'ignore'`, `'replace'`, etc).
* `newline` — controls newline translation in text mode; `None` (the default) enables universal newlines.

## Mode characters (can be combined)

* `'r'` — read (default). File must exist.
* `'w'` — write: truncates file or creates new.
* `'x'` — exclusive creation: fails if file exists (useful to avoid races).
* `'a'` — append: write to end; creates file if missing.
* `'b'` — binary mode (combine with other modes: `'rb'`, `'wb'`).
* `'t'` — text mode (default).
* `'+'` — update (read and write). e.g. `'r+'`, `'w+'`, `'a+'`.
* `'U'` — universal newlines (deprecated, then removed in Python 3.11). Text mode already translates newlines by default (`newline=None`).

**Examples**

```python
f = open('notes.txt', 'r', encoding='utf-8')     # read text
f = open('data.bin', 'wb')                       # write binary
f = open('append.log', 'a', encoding='utf-8')    # append text
f = open('both.txt', 'r+', encoding='utf-8')     # read and write
```

**Prefer** `with open(...) as f:` (context manager) to ensure file closure.
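As an illustration of the `'x'` mode described above, here is a small sketch that creates a file only if it does not already exist. The helper name `create_once` and the file name `config.ini` are made up for this example.

```python
import tempfile
from pathlib import Path

def create_once(path, text):
    """Create path with text; return False if the file already exists."""
    try:
        # 'x' fails with FileExistsError instead of truncating an existing file.
        with open(path, 'x', encoding='utf-8') as f:
            f.write(text)
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / 'config.ini'   # hypothetical file name
    first = create_once(target, 'first\n')
    second = create_once(target, 'second\n')
    content = target.read_text(encoding='utf-8')

print(first, second, repr(content))  # True False 'first\n'
```

Because the existence check and the creation happen in one call, there is no window in which another process can create the file between the two.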

---

# 3. File object basics — attributes & methods

Common methods and attributes on file objects:

* `.read(size=-1)` — read up to `size` characters/bytes. `-1` reads entire file.
* `.readline(size=-1)` — read a single line.
* `.readlines()` — read all lines into a list (memory heavy).
* `.write(s)` — write string (text mode) or bytes (binary). Returns number of characters/bytes written.
* `.writelines(iterable_of_strings)` — write a sequence (no separators added).
* `.seek(offset, whence=0)` — move file pointer: `whence=0` start, `1` current, `2` end.
* `.tell()` — current position: a byte offset in binary mode; in text mode an opaque number that is only meaningful when passed back to `.seek()`.
* `.flush()` — flush Python buffer to OS.
* `.close()` — close file. `with` does this automatically.
* `.fileno()` — OS file descriptor (integer). Useful for low-level ops like `os.fsync(f.fileno())`.
* `.encoding` — encoding used (text file).
* `.mode` — mode string used to open the file.
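Several of these methods can be exercised together on a small binary file. The sketch below uses a hypothetical file `demo.bin` in a temporary directory and opens it in `'w+b'` so it can be written and then read back.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'demo.bin'   # hypothetical file name
    with open(path, 'w+b') as f:
        written = f.write(b'0123456789')  # returns the number of bytes written
        f.seek(0)                         # back to the start
        head = f.read(4)                  # first four bytes
        pos = f.tell()                    # byte offset after the read
        f.seek(-3, 2)                     # three bytes before the end
        tail = f.read()                   # read to end of file
    closed = f.closed

print(written, head, pos, tail, closed)  # 10 b'0123' 4 b'789' True
```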

---

# 4. Writing to files — patterns and best practice

## Basic write (text)

```python
with open('example.txt', 'w', encoding='utf-8') as f:
    f.write('Hello, world\n')
    f.write('Second line\n')
```

* `'w'` truncates existing file. Use `'a'` to append.
* Always specify `encoding` for portability; default encoding depends on platform.

## Write a list of lines

```python
lines = ['line1\n', 'line2\n']
with open('lines.txt', 'w', encoding='utf-8') as f:
    f.writelines(lines)  # note: no newline added automatically
```

## Binary write (bytes)

```python
img_bytes = b'\x89PNG...'  # some bytes
with open('image.png', 'wb') as f:
    f.write(img_bytes)
```

## Flushing & ensuring data is on disk

* `.flush()` flushes Python’s buffers to OS. OS may still cache.
* To force write to physical disk:

```python
import os

with open('data.bin', 'wb') as f:
    f.write(b'important')
    f.flush()              # Python buffer -> OS
    os.fsync(f.fileno())   # OS cache -> storage device
```

`os.fsync()` is expensive; use only for critical durability guarantees.

## Writing atomically (safe writes)

When you need to replace a file safely (avoid partial files on crash), write to a temporary file and atomically replace:

```python
import os
import tempfile

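def atomic_write(path, data, mode='w', encoding='utf-8'):
    dirpath = os.path.dirname(path) or '.'
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dirpath)
    try:
        # Binary modes must not receive an encoding argument.
        with os.fdopen(fd, mode, encoding=None if 'b' in mode else encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Remove the temp file only if the write or the replace failed.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise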

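`os.replace()` renames the temp file over the target, overwriting it if it exists. On POSIX a successful rename is atomic; the call can fail if source and destination are on different filesystems, which is why the temp file is created in the target's directory. Use `tempfile.NamedTemporaryFile(..., delete=False)` or `mkstemp()` for control over the temp file.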

---

# 5. Reading files — methods, patterns, and large-file strategies

## Read whole file (small files)

```python
with open('example.txt', 'r', encoding='utf-8') as f:
    content = f.read()   # returns single string
```

**Caution:** `read()` loads entire file into memory — avoid on large files.

## Read line by line (memory-efficient)

```python
with open('big.log', 'r', encoding='utf-8') as f:
    for line in f:   # iterator yields one line at a time
        process(line)
```

This is the recommended pattern for large text files.

## `readline()` vs iteration

* `readline()` reads a single line (useful when you want manual control).
* Iteration over file (`for line in f:`) is buffered and efficient.
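A common case for manual control is reading a header line with `readline()` and then iterating over the remaining lines. In this sketch `io.StringIO` stands in for an open text file, and the sample data is invented.

```python
import io

# io.StringIO behaves like a text file opened for reading.
f = io.StringIO('name,score\nada,90\nlin,85\n')

# readline() consumes exactly one line, leaving the rest for the loop.
header = f.readline().rstrip('\n').split(',')
rows = [line.rstrip('\n').split(',') for line in f]
at_eof = f.readline()   # readline() returns '' at end of file

print(header)        # ['name', 'score']
print(rows)          # [['ada', '90'], ['lin', '85']]
print(repr(at_eof))  # ''
```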

## Read fixed-size chunks (binary or streaming)

```python
with open('video.mp4', 'rb') as f:
    while True:
        chunk = f.read(1024*1024)  # 1 MiB
        if not chunk:
            break
        process(chunk)
```

Use chunked reading for streaming large binary files or uploads.

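## `readlines()` and `hint`

`readlines(hint)` reads lines into a list. `hint` caps the amount read: no further lines are read once the total size (in bytes or characters) of the lines so far exceeds `hint`. `readlines()` with no hint reads everything and can be memory heavy; prefer iteration.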
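One way to use the size argument is to process a file in batches of whole lines. The generator below is a sketch; the name `iter_line_batches` and the batch size are assumptions, and `io.StringIO` stands in for a real file.

```python
import io

def iter_line_batches(f, hint=64 * 1024):
    """Yield lists of whole lines, each list roughly `hint` in size."""
    while True:
        batch = f.readlines(hint)
        if not batch:        # empty list means end of file
            break
        yield batch

f = io.StringIO(''.join(f'line {i}\n' for i in range(10)))
batches = list(iter_line_batches(f, hint=20))

print(len(batches) > 1)               # True: the input was split up
print(sum(len(b) for b in batches))   # 10: no line was lost
```

Each batch always ends on a line boundary, which a fixed-size `read()` does not guarantee.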

## Handling unknown/various encodings

* If you don’t know the encoding, detect it with a library (`chardet`, `charset-normalizer`) or read in binary and decode with fallbacks:

```python
with open('weird.txt', 'rb') as f:
    raw = f.read()

# Order matters: 'latin-1' accepts every byte sequence, so it must come last.
for enc in ('utf-8', 'windows-1252', 'latin-1'):
    try:
        text = raw.decode(enc)
        break
    except UnicodeDecodeError:
        continue
```

Alternatively, pass `errors='replace'` to avoid exceptions while reading. This is lossy, since each undecodable byte becomes U+FFFD:

```python
with open('weird.txt', 'r', encoding='utf-8', errors='replace') as f:
    text = f.read()
```

## Reading newline variations

Use `newline=None` (default) to enable universal newlines: `'\r'`, `'\n'`, `'\r\n'` all mapped to `'\n'` in read strings.
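The effect of newline translation can be checked by writing mixed line endings as bytes and reading them back in text mode. This sketch uses a hypothetical file `mixed.txt`; `newline=''` is shown for contrast because it disables translation.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'mixed.txt'   # hypothetical file name
    path.write_bytes(b'unix\nwindows\r\nold-mac\rend')

    # Default newline=None: every line ending is read as '\n'.
    with open(path, 'r', encoding='utf-8') as f:
        translated = f.read()

    # newline='': line endings are passed through untouched.
    with open(path, 'r', encoding='utf-8', newline='') as f:
        untranslated = f.read()

print(repr(translated))    # 'unix\nwindows\nold-mac\nend'
print(repr(untranslated))  # 'unix\nwindows\r\nold-mac\rend'
```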

---

# 6. File pointer operations — `seek`, `tell`, and random access

* `f.tell()` returns current offset.
* `f.seek(offset, whence)` moves pointer:

  * `whence=0` — from start (default)
  * `whence=1` — from current position
  * `whence=2` — from file end

Example: read last 100 bytes

```python
with open('file.bin', 'rb') as f:
    f.seek(-100, 2)
    data = f.read(100)
```

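**Note:** In text mode, `tell()` returns an opaque position and `seek()` accepts only such positions (or `0`); seeking relative to the current position or the end with a nonzero offset is not supported. For precise byte seeking use binary mode.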
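In text mode, the safe pattern is to save a position with `tell()` and later return to it with `seek()`, rather than computing offsets. The sketch below uses a hypothetical file `lines.txt` and also shows that an end-relative seek with a nonzero offset is rejected.

```python
import io
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'lines.txt'   # hypothetical file name
    path.write_text('h\u00e9llo\nworld\n', encoding='utf-8')

    with open(path, 'r', encoding='utf-8') as f:
        f.readline()
        mark = f.tell()        # opaque position: save it, do not compute it
        second = f.readline()
        f.seek(mark)           # return to the saved position
        again = f.readline()
        try:
            f.seek(-1, 2)      # nonzero end-relative seek in text mode
            error = None
        except io.UnsupportedOperation as exc:
            error = exc

print(repr(second), repr(again))  # 'world\n' 'world\n'
print(type(error).__name__)       # UnsupportedOperation
```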

---

# 7. Binary files, `mmap`, and working with bytes

## Binary vs text

* Text mode: Python decodes bytes to str using `encoding`.
* Binary mode: read/write `bytes` objects; needed for images, archives, compressed files.

## Memory-mapped files (`mmap`) — random access without reading whole file

```python
import mmap

with open('bigfile.bin', 'r+b') as f:  # must be read-write or use ACCESS_READ
    with mmap.mmap(f.fileno(), 0) as mm:   # length 0 maps the whole file
        # example: find substring
        i = mm.find(b'needle')
        if i != -1:
            print('found at', i)
```

`mmap` can be very efficient for random reads on large files. Careful with platform differences and file size.

---

# 8. File metadata & permissions (os, pathlib)

## `os` module

```python
import os, stat

os.path.exists('path')
os.path.isfile('path')
os.path.isdir('path')
os.stat('file')            # returns stat result (mode, size, mtime, etc)
os.remove('file')         # delete file
os.rename(src, dst)       # simple rename
os.replace(src, dst)      # atomic replace
os.chmod(path, 0o644)     # change permissions
```

`os.stat()` gives `st_mode`, `st_size`, `st_mtime` (modification time), etc.

## `pathlib` — modern, object-oriented paths

```python
from pathlib import Path

p = Path('example.txt')
p.write_text('hello', encoding='utf-8')   # create or overwrite
p.exists()
p.is_file()
p.read_text(encoding='utf-8')
p.stat()
p = p.rename('newname.txt')   # returns the new Path (Python 3.8+)
p.unlink()                    # delete
```

`pathlib` is highly recommended for cleaner code.

---

# 9. Renaming, moving, and deleting files — safe ways

## Rename (simple)

```python
import os
os.rename('old.txt', 'new.txt')   # Windows: FileExistsError if target exists; Unix: replaces a file silently
```

## Move across filesystems

Use `shutil.move()` — moves or renames; will copy+delete if across filesystems:

```python
import shutil
shutil.move('file.txt', '/other/fs/file.txt')
```

## Atomic replace

To replace target safely and atomically:

```python
os.replace('tempfile', 'targetfile')  # always replaces atomically on supported OS
```

## Delete

```python
import os
from pathlib import Path

# Three equivalent ways to delete a file; use one of them.
os.remove('file')      # remove file
os.unlink('file')      # same as remove
Path('file').unlink()  # pathlib
```

If you want to ignore missing files:

```python
try:
    os.remove('file')
except FileNotFoundError:
    pass
```

## Permanently deleting directories

* `os.rmdir(path)` removes empty directory.
* `shutil.rmtree(path)` removes directory tree (use with caution).

**Tip:** Be careful with `shutil.rmtree` — accidental deletes can be disastrous. Consider sending files to a trash library (third-party) if you want safer deletion.

---

# 10. Temporary files & directories (`tempfile`)

Use `tempfile` for safe temporary files:

```python
import tempfile

# Temporary file (always deleted when closed)
with tempfile.TemporaryFile() as tf:
    tf.write(b'hello')
    tf.seek(0)
    print(tf.read())

# Named temporary file (useful when another process needs to open it)
with tempfile.NamedTemporaryFile(delete=False) as ntf:
    print(ntf.name)
    ntf.write(b'data')
# remember to remove file later

# Temporary directory
with tempfile.TemporaryDirectory() as tmpdir:
    print('tmp dir:', tmpdir)
    # use files inside tmpdir
```

Use the `dir` parameter to control where temp files are created (useful for same filesystem atomic moves).

---

# 11. Concurrency & file locking (brief but practical)

Files are OS resources; concurrent access may lead to races:

* **Readers** can often access simultaneously.
* **Writers** can race: two writers writing simultaneously can corrupt files.

### Advisory locking (platform differences)

* On **Unix**: use `fcntl` (`fcntl.flock`) for advisory locks.
* On **Windows**: use `msvcrt.locking` or `pywin32`.
* Advisory locks are not enforced by the OS; they only protect a file if every cooperating process uses the same locking protocol.

Example (Unix advisory lock):

```python
import fcntl

with open('shared.txt', 'r+') as f:
    fcntl.flock(f, fcntl.LOCK_EX)   # exclusive
    try:
        # safely read/modify/write
        f.seek(0)
        data = f.read()
        f.seek(0)
        f.write('updated')
        f.truncate()
    finally:
        fcntl.flock(f, fcntl.LOCK_UN)
```

**Cross-platform libraries**: use `portalocker` (third-party) or design your application to avoid shared writable files (use databases, message queues, or per-process files plus atomic rename).

---

# 12. Security best practices (important)

* **Avoid path traversal:** never concatenate untrusted input into filenames naively. Use `pathlib.Path` and validate. Example: resolve the path (`os.path.realpath` or `Path.resolve()`, which also follow symlinks) and check that the result lies within the allowed directory.
* **Use safe temporary directories** (OS temp dirs are fine).
* **Avoid pickle for untrusted data** — pickle can execute arbitrary code on load. Prefer JSON for untrusted serialized data.
* **Least privilege:** set file permissions appropriately; don’t create world-writable files unless necessary.
* **Sanitize filenames** if derived from user-supplied strings.
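The path traversal check from the list above might look like the following sketch. The helper name `resolve_inside` and the base directory `/srv/uploads` are hypothetical; the paths do not need to exist for the check to work.

```python
from pathlib import Path

def resolve_inside(base_dir, user_path):
    """Return the resolved path, or raise ValueError if it leaves base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()   # collapses '..' and follows symlinks
    try:
        candidate.relative_to(base)            # raises ValueError if outside base
    except ValueError:
        raise ValueError(f'{user_path!r} escapes {base}') from None
    return candidate

base = '/srv/uploads'   # hypothetical allowed directory
ok = resolve_inside(base, 'reports/2024.csv')

rejected = []
for bad in ('../secrets.txt', '/etc/passwd', 'a/../../b'):
    try:
        resolve_inside(base, bad)
    except ValueError:
        rejected.append(bad)

print(ok.name)   # 2024.csv
print(rejected)  # ['../secrets.txt', '/etc/passwd', 'a/../../b']
```

Resolving before comparing matters: a plain string prefix check would accept `/srv/uploads/../secrets.txt`.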

---

# 13. Useful recipes — ready code snippets

## 13.1 Read entire file safely

```python
from pathlib import Path

def read_text_file(path, encoding='utf-8'):
    p = Path(path)
    return p.read_text(encoding=encoding)

print(read_text_file('notes.txt'))
```

## 13.2 Append a line to a text file

```python
def append_line(path, line, encoding='utf-8'):
    with open(path, 'a', encoding=encoding) as f:
        if not line.endswith('\n'):
            line += '\n'
        f.write(line)
```

## 13.3 Stream upload (read in chunks)

```python
def stream_file(path, chunk_size=1024*1024):
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

## 13.4 Atomic write with `tempfile` + `os.replace`

```python
import tempfile, os

def atomic_write_text(path, text, encoding='utf-8'):
    dirpath = os.path.dirname(path) or '.'
    fd, tmp = tempfile.mkstemp(dir=dirpath)
    try:
        with os.fdopen(fd, 'w', encoding=encoding) as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    finally:
        # cleanup in case of failure
        try:
            os.remove(tmp)
        except OSError:
            pass
```

## 13.5 Delete file if exists (safe)

```python
from pathlib import Path

def safe_remove(path):
    p = Path(path)
    try:
        p.unlink()
    except FileNotFoundError:
        pass
```

## 13.6 Read CSV safely (text mode + encoding)

```python
import csv
from pathlib import Path

def read_csv(path, encoding='utf-8'):
    with open(path, newline='', encoding=encoding) as f:
        reader = csv.DictReader(f)
        for row in reader:
            process(row)
```

---

# 14. Advanced topics & notes

## 14.1 `io` module and text vs binary streams

The `io` module provides classes: `TextIOBase`, `BufferedReader/Writer`, `RawIOBase`. `open()` returns `TextIOWrapper` for text files wrapping a buffered binary stream.
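The layering can be inspected directly: a text file object exposes its buffered layer as `.buffer`, and that layer exposes the raw file as `.raw`. This is a small sketch using a hypothetical file `layers.txt`.

```python
import io
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'layers.txt'   # hypothetical file name

    # Text mode: TextIOWrapper -> BufferedWriter -> FileIO
    with open(path, 'w', encoding='utf-8') as f:
        layers = (type(f), type(f.buffer), type(f.buffer.raw))

    # Binary mode returns the buffered layer directly.
    with open(path, 'rb') as fb:
        binary_type = type(fb)

    # Unbuffered binary mode returns the raw layer.
    with open(path, 'rb', buffering=0) as fr:
        raw_type = type(fr)

print([t.__name__ for t in layers])              # ['TextIOWrapper', 'BufferedWriter', 'FileIO']
print(binary_type.__name__, raw_type.__name__)   # BufferedReader FileIO
```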

## 14.2 File descriptors & low-level OS calls

Use `os.open()` for low-level control (flags, modes), returns file descriptor (int). Must call `os.close(fd)`.
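A minimal sketch of descriptor-level I/O follows. The file name `lowlevel.bin` is hypothetical; the flags request exclusive creation with owner-only permissions, and each descriptor is closed in a `finally` block because there is no context manager at this level.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'lowlevel.bin')   # hypothetical file name

    # O_EXCL with O_CREAT fails if the file already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        written = os.write(fd, b'raw bytes')   # returns the number of bytes written
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, 4, os.SEEK_SET)   # skip the first four bytes
        data = os.read(fd, 100)        # read up to 100 bytes
    finally:
        os.close(fd)

print(written, data)  # 9 b'bytes'
```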

## 14.3 Files and multiprocessing

When forking, open file descriptors are inherited by the child, so be mindful. Open files after forking where possible, or pass `close_fds=True` to `subprocess.Popen` when spawning subprocesses.

## 14.4 Working with compressed files

Use `gzip`, `bz2`, `lzma` modules for transparent compressed file handling:

```python
import gzip
with gzip.open('file.txt.gz', 'rt', encoding='utf-8') as f:
    text = f.read()
```

## 14.5 Databases vs files

For concurrent writes, indexing, and queries, prefer a real database (SQLite, PostgreSQL) over ad-hoc file writes.

---

# 15. Common mistakes & how to avoid them

* **Not specifying encoding** → platform-specific bugs. Always set `encoding='utf-8'` unless you have a reason.
* **Using `read()` on huge files** → OOM. Use iteration or chunked reads.
* **Relying on `os.rename` for replacement** → it fails across filesystems and, on Windows, when the target exists. Use `os.replace` to overwrite within one filesystem and `shutil.move` (not atomic) across filesystems.
* **Assuming `write()` writes all bytes** — it returns the number of characters or bytes written. Buffered file objects write everything, but raw unbuffered streams (`buffering=0`) and `os.write()` may write only part of the data.
* **Ignoring exceptions on file ops** — catch `FileNotFoundError`, `PermissionError` to give clearer user messages.
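As a sketch of the last point, the helper below turns common failures into readable messages instead of tracebacks. The function name `read_or_explain` and the message wording are assumptions made for this example.

```python
import tempfile
from pathlib import Path

def read_or_explain(path, encoding='utf-8'):
    """Return (text, None) on success or (None, message) on a known failure."""
    try:
        return Path(path).read_text(encoding=encoding), None
    except FileNotFoundError:
        return None, f'{path}: file does not exist'
    except PermissionError:
        return None, f'{path}: permission denied'
    except UnicodeDecodeError as exc:
        return None, f'{path}: not valid {encoding} ({exc.reason})'

with tempfile.TemporaryDirectory() as tmpdir:
    good = Path(tmpdir) / 'good.txt'      # hypothetical file names
    bad = Path(tmpdir) / 'bad.txt'
    missing = Path(tmpdir) / 'missing.txt'
    good.write_text('ok', encoding='utf-8')
    bad.write_bytes(b'\xff\xfe')          # not valid UTF-8

    results = [read_or_explain(p) for p in (good, bad, missing)]

for text, message in results:
    print(text, '|', message)
```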

---

# 16. Quick reference cheatsheet

* Open: `open(path, 'r', encoding='utf-8')`
* Read all: `s = f.read()`
* Read line: `line = f.readline()`
* Iterate lines: `for line in f:`
* Write: `f.write('text')`
* Append: `open(path, 'a', encoding='utf-8')`
* Binary: use `'rb'` / `'wb'`
* Seek / tell: `f.seek(offset, whence)`, `f.tell()`
* Close: `f.close()` or `with open(...) as f:`
* Atomic replace: `os.replace(tempfile, target)`
* Rename/move: `os.rename()` / `shutil.move()`
* Delete: `os.remove()` / `Path(...).unlink()`
* Temp files: `tempfile.TemporaryFile()`, `NamedTemporaryFile()`, `TemporaryDirectory()`

---

# 17. Example: full end-to-end script

A short script that safely reads, modifies, and atomically replaces a text file:

```python
#!/usr/bin/env python3
import os
import tempfile
from pathlib import Path

def modify_file_atomic(path, transform_func, encoding='utf-8'):
    """
    Read text file, apply transform_func(str) -> str, and atomically
    replace original file with new content.
    """
    p = Path(path)
    original_text = p.read_text(encoding=encoding)

    new_text = transform_func(original_text)

    dirpath = p.parent  # already Path('.') for a bare filename
    fd, tmp = tempfile.mkstemp(dir=str(dirpath))
    try:
        with os.fdopen(fd, 'w', encoding=encoding) as f:
            f.write(new_text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, str(p))  # atomic replace
    finally:
        try:
            os.remove(tmp)
        except Exception:
            pass

if __name__ == '__main__':
    def add_footer(text):
        return text + '\n-- End of file --\n'
    modify_file_atomic('notes.txt', add_footer)
```

---

## 1) Python – Directories

### 🔹 What is a directory?

A directory (also called folder) is a filesystem container that holds files and (possibly) other directories (subdirectories). In Python, you manipulate directories via modules like `os`, `os.path`, `pathlib`, and sometimes `shutil`. The operations include: getting current directory, changing directory, creating a directory, listing contents, deleting a directory, walking a directory tree, etc.

### 🔹 Important directory operations & how to use

#### Get current working directory

```python
import os
cwd = os.getcwd()
print("Current working directory:", cwd)
```

`os.getcwd()` returns the path of the current working directory. ([GeeksforGeeks][1])

#### Change working directory

```python
import os
os.chdir('/path/to/newdir')
print("Now cwd:", os.getcwd())
```

`os.chdir(path)` changes the process’s working directory. ([GeeksforGeeks][1])

#### Create a new directory

```python
import os
os.mkdir('new_folder')  # creates one directory
# If you want to create nested directories:
os.makedirs('parent/child/grandchild', exist_ok=True)
```

* `os.mkdir(path[, mode])` — create a single directory. Will error if already exists (unless you catch exception). ([GeeksforGeeks][1])
* `os.makedirs(path[, mode, exist_ok])` — create directories recursively (all missing parent directories). ([GeeksforGeeks][1])

#### List directory contents

```python
import os
entries = os.listdir('/path/to/dir')
print("Contents:", entries)
```

`os.listdir(path)` returns a list of names (files + directories) in the given path (not including `.` or `..`). ([w3schools.com][2])
You can filter only files or only directories by combining `os.path.isfile()` or `os.path.isdir()` (see next parts).
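The filtering mentioned above can be sketched as follows. The names `sub` and `a.txt` are invented, and the directory is a temporary one so the example leaves nothing behind. Note that `os.listdir()` returns bare names, so each must be joined with the directory before testing it.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'sub'))
    with open(os.path.join(root, 'a.txt'), 'w', encoding='utf-8') as f:
        f.write('x')

    entries = os.listdir(root)
    # Join each name with root; a bare name would be checked against the cwd.
    files = sorted(n for n in entries if os.path.isfile(os.path.join(root, n)))
    dirs = sorted(n for n in entries if os.path.isdir(os.path.join(root, n)))

print(files)  # ['a.txt']
print(dirs)   # ['sub']
```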

#### Walk a directory tree (recursive listing)

```python
import os
for root, dirs, files in os.walk('/path/to/dir'):
    print("Root:", root)
    print(" Subdirs:", dirs)
    print(" Files:", files)
```

`os.walk()` yields tuples for each directory in the tree (top down by default) and is useful for recursive operations like find, delete, backup. See StackOverflow example. ([Stack Overflow][3])

#### Delete / remove a directory

```python
import os
os.rmdir('empty_dir')  # only works if directory is empty
# For non-empty directory:
import shutil
shutil.rmtree('dir_with_contents')
```

* `os.rmdir(path)` removes an empty directory. ([TutorialsPoint][4])
* To remove directory tree use `shutil.rmtree()` (not strictly `os`, but related). When using `os.removedirs(path)` you can remove nested directories if they are empty. ([TutorialsPoint][4])

#### Best practices

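* Checking `os.path.exists(path)` or `os.path.isdir(path)` before creating or deleting helps, but another process can change the directory between the check and the action. Where that matters, prefer `exist_ok=True` or catch `FileExistsError` / `FileNotFoundError`.
* Use `pathlib.Path` for more readable code (though core methods are similar).
* Beware of permissions, race conditions, and symbolic links (symlinks) when creating or deleting directories.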
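For comparison, the same directory operations can be written with `pathlib`. This is a sketch in a temporary directory; the names `parent`, `child`, and `note.txt` are placeholders.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    root = Path(tmpdir)
    nested = root / 'parent' / 'child'

    nested.mkdir(parents=True, exist_ok=True)   # like os.makedirs
    nested.mkdir(parents=True, exist_ok=True)   # second call is a no-op, not an error
    (nested / 'note.txt').write_text('hi', encoding='utf-8')

    names = sorted(p.name for p in root.iterdir())          # like os.listdir
    txt_files = sorted(p.relative_to(root).as_posix()
                       for p in root.rglob('*.txt'))        # recursive search

    (nested / 'note.txt').unlink()
    nested.rmdir()                                          # like os.rmdir: must be empty
    still_there = nested.exists()

print(names)        # ['parent']
print(txt_files)    # ['parent/child/note.txt']
print(still_there)  # False
```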

---

## 2) Python – File Methods (OS / filesystem methods for files)

Here “File Methods” refers to functions provided by `os` (and related modules) that operate on files as filesystem objects rather than reading or writing their content. These include renaming, removing, checking metadata, permissions, links, etc.

### 🔹 Major file-related methods and usage

#### Remove / delete a file

```python
import os
os.remove('file.txt')           # removes file
# or alias
os.unlink('file.txt')
```

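If the path does not exist, `os.remove()` raises `FileNotFoundError`; if it is a directory, it raises an `OSError` subclass such as `IsADirectoryError` or `PermissionError`, depending on the platform. ([Programiz][5])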

#### Rename / move a file

```python
import os
os.rename('old_name.txt', 'new_name.txt')
```

`os.rename(src, dst)` renames a file or directory (if permitted). It works within a single filesystem; a rename across different filesystems may fail with `OSError`. ([TutorialsPoint][4])
For cross-filesystem moves, better to use `shutil.move()`.

#### Check file metadata (stat)

```python
import os
info = os.stat('file.txt')
print("Size:", info.st_size)
print("Last modified:", info.st_mtime)
print("Permissions:", oct(info.st_mode))
```

`os.stat(path)` returns a stat result with many fields: size, timestamps, etc. ([TutorialsPoint][4])

#### Change file mode / permissions

```python
import os
os.chmod('file.txt', 0o644)  
```

`os.chmod(path, mode)` sets the file permission bits. ([w3schools.com][6])

#### Create symbolic link or hard link

```python
import os
os.symlink('source.txt', 'link_to_source.txt')
os.link('source.txt', 'hard_link_source.txt')
```

`os.symlink(src, dst)` creates a symlink; `os.link(src, dst)` a hard link. ([TutorialsPoint][4])

#### Check access and other low-level operations

The `os` module also offers many low-level functions, such as `os.access()` to check read/write/execute permission, `os.dup2()` to duplicate a file descriptor, and `os.close()` to close one. ([w3schools.com][6])
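A short sketch of `os.access()` follows, using a hypothetical file `report.txt`. The result is only a snapshot: permissions can change between the check and a later `open()`, so handling the exception from `open()` is still needed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'report.txt')   # hypothetical file name
    with open(path, 'w', encoding='utf-8') as f:
        f.write('data')

    exists = os.access(path, os.F_OK)      # does the path exist?
    readable = os.access(path, os.R_OK)    # may this process read it?
    missing = os.access(os.path.join(tmpdir, 'nope.txt'), os.F_OK)

print(exists, readable, missing)  # True True False
```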

### 🔹 Example: safe delete with existence check

```python
import os

path = 'old_data.csv'
if os.path.exists(path) and os.path.isfile(path):
    try:
        os.remove(path)
        print("Deleted", path)
    except PermissionError as e:
        print("Permission denied:", e)
else:
    print("Path does not exist or is not a file.")
```

---

## 3) Python – OS File/Directory Methods (overview of key `os` module functions for file + directory management)

### 🔹 Core `os` module functions for filesystem operations

The `os` module provides a broad set of functions for interacting with the operating system — filesystems, directories, environment variables, process control, etc. We’ll focus on file/directory relevant ones.

#### Current working directory & environment

```python
import os
cwd = os.getcwd()                # get current directory
os.chdir('/another/path')        # change directory
env_value = os.getenv('HOME')    # get environment variable
```

`os.getcwd()` / `os.chdir()` are core for directory context. ([GeeksforGeeks][1])

#### Directory operations (some overlap with first section)

* `os.mkdir(path[, mode])` — create directory.
* `os.makedirs(path[, mode, exist_ok])` — create directories recursively.
* `os.listdir(path='.')` — list entries. ([GeeksforGeeks][7])
* `os.removedirs(path)` — remove directories recursively (removes leaf and parent if they become empty). ([TutorialsPoint][4])
* `os.rmdir(path)` — remove single empty directory.
* `os.remove(path)` / `os.unlink(path)` — remove file.
* `os.rename(src, dst)` — rename file or directory.
* `os.replace(src, dst)` — rename, overwriting destination if exists (atomic on many OSes).
* `os.walk(top, topdown=True, onerror=None, followlinks=False)` — walk directory tree.
* `os.scandir(path)` — return iterator of `DirEntry` objects (more efficient than `listdir` + `stat`). ([GeeksforGeeks][8])
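As a sketch of the last entry in the list, `os.scandir()` can report the type and size of each entry in one pass. The names `logs` and `data.bin` are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'logs'))
    with open(os.path.join(root, 'data.bin'), 'wb') as f:
        f.write(b'12345')

    # scandir is a context manager; each DirEntry caches type information.
    with os.scandir(root) as it:
        listing = sorted(
            (entry.name,
             entry.is_dir(),
             entry.stat().st_size if entry.is_file() else None)
            for entry in it
        )

print(listing)  # [('data.bin', False, 5), ('logs', True, None)]
```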

#### File descriptor / lower-level operations

* `os.open(file, flags[, mode])` — open file descriptor with low-level control.
* `os.close(fd)` — close descriptor.
* `os.read(fd, n)`, `os.write(fd, data)` — read/write from descriptor.
* `os.lseek(fd, pos, how)` — reposition descriptor pointer.
* `os.fsync(fd)`, `os.fdatasync(fd)` — force the OS to write the descriptor's buffered data to disk. ([w3schools.com][6])

### 🔹 Example: create directory tree, list, rename

```python
import os

root = 'proj_data'
sub1 = os.path.join(root, 'input')
sub2 = os.path.join(root, 'output')

os.makedirs(sub1, exist_ok=True)
os.makedirs(sub2, exist_ok=True)

print("Contents of root:", os.listdir(root))

os.rename(sub2, os.path.join(root, 'results'))
print("After rename:", os.listdir(root))
```

---

## 4) Python – OS Path Methods (`os.path` module) — path manipulation utilities

The `os.path` module (or `pathlib` alternative) provides methods to manipulate **pathnames** (strings/bytes representing file/directory paths) in a way that is cross-platform aware (separator differences, drive letters, symlinks, etc). ([Python documentation][9])

### 🔹 Key methods & what they do

| Method                                                       | Description                                                                                                            |
| ------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- |
| `os.path.abspath(path)`                                      | Returns a normalized absolute version of the path. ([TutorialsPoint][10])                                              |
| `os.path.basename(path)`                                     | Returns the final component (file or directory name) of the path. ([TutorialsPoint][10])                               |
| `os.path.dirname(path)`                                      | Returns the directory name portion (everything except the final component). ([TutorialsPoint][10])                     |
| `os.path.exists(path)`                                       | Returns `True` if the path (file or directory) exists. ([TutorialsPoint][10])                                          |
| `os.path.isabs(path)`                                        | Returns `True` if the path is absolute. ([GeeksforGeeks][11])                                                          |
| `os.path.isfile(path)`                                       | Returns `True` if the path is an existing regular file. ([TutorialsPoint][10])                                         |
| `os.path.isdir(path)`                                        | Returns `True` if the path is an existing directory. ([GeeksforGeeks][11])                                             |
| `os.path.join(path1, path2, …)`                              | Joins one or more path components intelligently, taking care of separators. ([ioflood.com][12])                        |
| `os.path.normpath(path)`                                     | Normalizes path (collapses `A//B`, `A/./B`, `A/foo/../B` → `A/B`) and adjusts separators for OS. ([GeeksforGeeks][11]) |
| `os.path.realpath(path)`                                     | Returns the canonical path, resolving symlinks. ([TutorialsPoint][10])                                                 |
| `os.path.getsize(path)`                                      | Returns the size of file (in bytes). ([TutorialsPoint][10])                                                            |
| `os.path.getmtime(path)`, `getctime(path)`, `getatime(path)` | Return timestamps: last modification, creation (Windows) or last metadata change (Unix), and last access. ([TutorialsPoint][10]) |
| `os.path.splitext(path)`                                     | Splits path into (root, extension) pair. ([TutorialsPoint][10])                                                        |

### 🔹 Example usage: path manipulations

```python
import os

p = '../data/./file.txt'
abs_p = os.path.abspath(p)
print("Absolute path:", abs_p)

root, ext = os.path.splitext(abs_p)
print("Root:", root, "Extension:", ext)

dirname = os.path.dirname(abs_p)
basename = os.path.basename(abs_p)
print("Directory:", dirname)
print("Filename:", basename)

joined = os.path.join(dirname, 'otherfile.txt')
print("Joined path:", joined)

if os.path.exists(joined):
    print("File exists and is file?", os.path.isfile(joined))
```

### 🔹 Why these matter & portability

* Paths differ between OSes: Windows uses backslashes `\`, Unix uses forward slash `/`. `os.path.join()` handles correct separator. ([ioflood.com][12])
* Normalizing paths helps avoid duplicates or mismatches when comparing paths (e.g., `A/B/../C` vs `A/C`).
* Checking existence and type (`isfile`, `isdir`) is critical before performing file operations to avoid errors.
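The normalization point can be illustrated by comparing two spellings of the same path, alongside the `pathlib` equivalents of `dirname`, `basename`, and `splitext`. The sample paths are invented; `PurePosixPath` is used so the result does not depend on the OS running the example.

```python
import os
from pathlib import PurePosixPath

# Two spellings of the same location compare equal only after normalizing.
a = os.path.normpath('A/B/../C')
b = os.path.normpath('A//C/.')
same = a == b

# pathlib exposes the same pieces as attributes.
p = PurePosixPath('/srv/data/report.final.csv')   # hypothetical path
parts = (str(p.parent), p.name, p.stem, p.suffix)

print(same)   # True
print(parts)  # ('/srv/data', 'report.final.csv', 'report.final', '.csv')
```

Like `os.path.splitext()`, `.suffix` returns only the last extension.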

---




