---

**Q9: Log File Tail Reader**  
*Medium-Hard | Level 5*  
Write a function `tail(file_path, N)` that returns the last `N` lines from a log file.  
- Input: `'app.log', 10`  
- Output: Last 10 lines as a list.

---

In [32]:
def tail(file_path, N):
    # Validate the arguments before touching the file system.
    if not isinstance(file_path, str):
        raise TypeError('file_path must be a string.')
    if not isinstance(N, int) or N <= 0:
        raise ValueError('N must be a positive integer.')

    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            lines = list(file)   # reads every line into memory
            return lines[-N:]    # slicing also covers files shorter than N lines
    except FileNotFoundError:
        raise FileNotFoundError(f"File not found: {file_path}")
    except PermissionError:
        raise PermissionError(f"No read permission: {file_path}")
    except OSError as e:
        raise OSError(f"Failed to read file: {e}")

In [33]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/application.log'
num = 2
response = tail(file_path,num)
print(response)
print(len(response))

['2023-10-15 08:32:05 WARN  [Security] 3 failed login attempts from 203.0.113.42  \n', '2023-10-15 08:32:30 DEBUG [Worker-4] Processing completed in 142ms  ']
2


### More optimized code

In [3]:
from collections import deque

def tail_large_file(file_path, N):
    if not isinstance(file_path, str):
        raise TypeError("File path must be a string.")
    if not isinstance(N, int) or N <= 0:
        raise ValueError("N must be a positive integer.")

    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            # A bounded deque keeps only the last N lines while the file is streamed,
            # so memory use depends on N rather than on the file size.
            return list(deque(f, maxlen=N))
    except FileNotFoundError:
        # Raise instead of returning a message string, so callers always get a list back.
        raise FileNotFoundError(f"File not found: {file_path}")
    except PermissionError:
        raise PermissionError(f"No read permission: {file_path}")
    except OSError as e:
        raise OSError(f"Failed to read file: {e}")

In [4]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/application.log'
num = 2
response = tail_large_file(file_path,num)
print(response)
print(len(response))

['2023-10-15 08:32:05 WARN  [Security] 3 failed login attempts from 203.0.113.42  \n', '2023-10-15 08:32:30 DEBUG [Worker-4] Processing completed in 142ms  ']
2
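
The functions above still read the file from the first line to the last. For very large logs, one alternative is to read backwards from the end in fixed-size blocks and stop once enough newlines have been collected. The sketch below is only an illustration of that idea: `tail_by_blocks` and `block_size` are hypothetical names, the file is assumed to be UTF-8, and because it opens the file in binary mode, Windows line endings (`\r\n`) are returned unchanged.

```python
import os

def tail_by_blocks(file_path, n, block_size=4096):
    """Return the last n lines by reading blocks from the end of the file."""
    with open(file_path, "rb") as f:   # binary mode keeps seek offsets exact
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # More than n newlines guarantees that the last n lines are complete.
        while position > 0 and data.count(b"\n") <= n:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            data = f.read(read_size) + data
    # Split on bytes first, so a block boundary inside a multi-byte
    # character can only affect the discarded leading fragment.
    last_lines = data.splitlines(keepends=True)[-n:]
    return [line.decode("utf-8") for line in last_lines]
```

With this approach the amount of data read depends on `n` and the line lengths near the end of the file, not on the total file size.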



---

**Q10: Split Large File into Chunks**  
*Medium-Hard | Level 5*  
Write a function `split_file(file_path, lines_per_file)` that splits a large file into multiple smaller files with `lines_per_file` lines each.  
- Input: `data.txt, 1000`  
- Output: Files like `data_part1.txt`, `data_part2.txt`, etc.

---

In [11]:
import os

def split_file(file_path, lines_per_file):
    if not isinstance(file_path, str):
        raise TypeError("File path must be a string.")
    if not isinstance(lines_per_file, int) or lines_per_file <= 0:
        raise ValueError("lines_per_file must be a positive integer.")

    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")
    if os.stat(file_path).st_size == 0:
        print("The source file is empty. No parts created.")
        return

    # The output location depends only on the source path, so compute it once.
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    directory = os.path.dirname(file_path)

    part_number = 1
    new_file_name = None
    current_output_file = None

    try:
        with open(file_path, 'r', encoding='utf-8') as source_file:
            for line_number, line in enumerate(source_file, start=1):

                # Start a new part on line 1, lines_per_file + 1, 2 * lines_per_file + 1, ...
                if (line_number - 1) % lines_per_file == 0:
                    if current_output_file is not None:
                        current_output_file.close()

                    new_file_name = os.path.join(directory, f"{base_name}_part_{part_number}.txt")

                    current_output_file = open(new_file_name, 'w', encoding='utf-8')
                    print(f"Creating: {new_file_name}")
                    part_number += 1

                current_output_file.write(line)

        # Report success only when the whole source file was processed.
        print("Splitting complete!")
    except PermissionError:
        raise PermissionError(f"Permission denied while splitting: {new_file_name or file_path}")
    except OSError as e:
        raise OSError(f"Failed to read or write file: {e}")
    finally:
        # Always close the last open part, even if an error interrupted the loop.
        if current_output_file is not None:
            current_output_file.close()


In [12]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split.txt'
lines_per_file = 5
split_file(file_path,lines_per_file)

Creating: C:/Users/Mahbub/Desktop/Data Engineering/Python/data\split_part_1.txt
Creating: C:/Users/Mahbub/Desktop/Data Engineering/Python/data\split_part_2.txt
Creating: C:/Users/Mahbub/Desktop/Data Engineering/Python/data\split_part_3.txt
Splitting complete!



---

### 🧠 Line 1:
```python
base_name = os.path.splitext(os.path.basename(file_path))[0]
```

👉 **Meaning:**  
- `os.path.basename(file_path)` → Extracts just the **file name** from a full path.  
Example:  
```python
'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split.txt'
```
becomes:  
```
split.txt
```

- `os.path.splitext(...)[0]` → Splits off the **extension** (`.txt`).  
So:
```
split.txt  →  split
```

✅ So this gives you the file name **without** `.txt` — only:  
```
base_name = 'split'
```

---

### 🧠 Line 2:
```python
directory = os.path.dirname(file_path)
```

👉 **Meaning:**  
This extracts only the **folder path** from the full path.  

Given:
```
'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split.txt'
```
this will output:
```
'C:/Users/Mahbub/Desktop/Data Engineering/Python/data'
```

✅ So you can save new files in the **same folder**.

---

### 🧠 Line 3:
```python
new_file_name = os.path.join(directory, f"{base_name}_part_{part_number}.txt")
```

👉 **Meaning:**  
- `os.path.join(directory, ...)` → Combines the directory and new filename in a safe way (works on Windows, Linux, Mac).  
- `f"{base_name}_part_{part_number}.txt"` → Makes the new filename, like:
```
split_part_1.txt
split_part_2.txt
...
```

---

✅ **Full Example:**  

Let’s say:  
```python
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split.txt'
part_number = 1
```

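The 3 lines will produce (on Windows, where `os.path.join` adds `\` as the separator):  
```python
base_name = 'split'
directory = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data'
# os.path.join uses the platform separator, so Windows prints ...data\split_part_1.txt
new_file_name = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data\\split_part_1.txt'
```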

---

💡 **Why this is smart:**  
- Avoids hardcoding paths.
- `os.path.join` picks the separator for the current platform, so the same code runs on Windows, Linux, and macOS without handling `/` or `\` manually.

---

✅ **Simple Summary:**  
These 3 lines:  
➡️ Grab the original folder,  
➡️ Create a new filename like `split_part_1.txt`,  
➡️ Save it in the same folder as the original file.
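
The same three steps can also be sketched with `pathlib`, which wraps the path in an object instead of passing strings between `os.path` functions. This is just an alternative illustration, not the approach used in the solution above; `part_path` is a hypothetical helper name.

```python
from pathlib import Path

def part_path(file_path, part_number):
    """Build the path of one output part next to the source file."""
    source = Path(file_path)
    # .stem is the file name without its extension;
    # with_name() swaps only the last path component and keeps the folder.
    return source.with_name(f"{source.stem}_part_{part_number}.txt")

example = part_path('C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split.txt', 1)
print(example.name)  # split_part_1.txt
```

The returned `Path` can be passed directly to `open()`, so the rest of `split_file` would not need to change.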

---


**Q11: Merge Multiple Files**  
*Medium-Hard | Level 6*  
Write a function `merge_files(file_list, output_file)` that merges multiple text files into one single file.  
- Input: `['file1.txt', 'file2.txt']`  
- Output: `merged.txt`

---

In [16]:
import os

def merge_files(file_list, output_file):
    if not isinstance(file_list, list) or not all(isinstance(f, str) for f in file_list):
        raise TypeError("file_list must be a list of string file paths")
    if not isinstance(output_file, str):
        raise TypeError("output_file must be a string path")

    try:
        with open(output_file, 'w', encoding='utf-8') as outfile:
            for file_path in file_list:
                if not os.path.exists(file_path):
                    print(f"Skipping: {file_path} (File not found)")
                    continue
                with open(file_path, 'r', encoding='utf-8') as infile:
                    content = infile.read()
                    outfile.write(content)
                    # Add a newline only when the file does not already end with one,
                    # so the merged output is not padded with blank lines.
                    if content and not content.endswith('\n'):
                        outfile.write('\n')
                    print(f'merged file - {file_path}')

            print(f"✅ All files merged into: {output_file}")

    except PermissionError:
        raise PermissionError(f"Permission denied while merging into: {output_file}")
    except OSError as e:
        raise OSError(f"Failed to read or write file: {e}")

In [17]:
file_list =['C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_1.txt',
           'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_2.txt',
           'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_3.txt']
output_file = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/merged.txt'

merge_files(file_list,output_file)

merged file - C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_1.txt
merged file - C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_2.txt
merged file - C:/Users/Mahbub/Desktop/Data Engineering/Python/data/split_part_3.txt
✅ All files merged into: C:/Users/Mahbub/Desktop/Data Engineering/Python/data/merged.txt
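
`merge_files` reads each input completely with `infile.read()`. If the inputs are large, a possible variation is to copy them in chunks with `shutil.copyfileobj`. The sketch below is an assumption-laden illustration: `merge_files_streaming` and `chunk_size` are hypothetical names, files are copied as raw bytes (so no decoding is checked), and a missing input raises `FileNotFoundError` instead of being skipped.

```python
import os
import shutil

def merge_files_streaming(file_list, output_file, chunk_size=1024 * 1024):
    """Concatenate files chunk by chunk without loading any of them fully."""
    with open(output_file, "wb") as outfile:
        for file_path in file_list:
            with open(file_path, "rb") as infile:
                shutil.copyfileobj(infile, outfile, chunk_size)
                # After the copy the read position is at the end of the input.
                # Look at its last byte and add a newline if one is missing,
                # so the next file starts on a fresh line.
                if infile.tell() > 0:
                    infile.seek(-1, os.SEEK_END)
                    if infile.read(1) != b"\n":
                        outfile.write(b"\n")
```

Memory use is then bounded by `chunk_size` rather than by the size of the largest input file.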


---

**Q12: Count Word Frequency in File**  
*Hard | Level 6*  
Write a function `word_frequency(file_path)` that returns a dictionary of each word and its frequency.  
- Input: `'data.txt'`  
- Output: `{'word1': 5, 'word2': 3}`

---

In [25]:
import re

def word_frequency(file_path):
    if not isinstance(file_path, str):
        raise TypeError("file path must be a string")
    output = {}
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            content = file.read().lower()
            content = re.sub(r'[^\w\s]', '', content)  # Remove punctuation
            words = content.split()

            for word in words:
                # dict.get returns 0 the first time a word is seen
                output[word] = output.get(word, 0) + 1

            return output

    except PermissionError:
        raise PermissionError(f"No read permission for file: {file_path}")
    except OSError as e:
        raise OSError(f"Failed to read file: {e}")

In [26]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/info.log'

result = word_frequency(file_path)
print(result)

{'mahbub': 1, 'hossain': 2, 'faisal': 1, 'raihan': 1, 'asif': 1, 'abeer': 1, 'imran': 1}
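
For a large file, the same counting can be done one line at a time with `collections.Counter`, so only the counts are kept in memory. This is a minimal sketch under the same cleaning rules as above (lowercase, strip punctuation, split on whitespace); `word_frequency_streaming` is a hypothetical name.

```python
import re
from collections import Counter

def word_frequency_streaming(file_path):
    """Count words line by line instead of reading the whole file at once."""
    counts = Counter()
    with open(file_path, "r", encoding="utf-8") as file:
        for line in file:
            # Same cleaning as word_frequency: lowercase, then drop punctuation.
            cleaned = re.sub(r"[^\w\s]", "", line.lower())
            counts.update(cleaned.split())
    return dict(counts)  # plain dict, to match the output format of the question
```

`Counter` also offers `most_common(k)`, which is handy when only the top few words are needed.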


---

**Q13: Replace Specific Word in Large File**  
*Hard | Level 7*  
Write a function `replace_word(file_path, old_word, new_word)` that replaces all instances of `old_word` with `new_word` in the same file, efficiently handling large files (avoid loading entire file into memory).

---

In [5]:
import os

def replace_word(file_path, old_word, new_word):
    if not isinstance(file_path, str):
        raise TypeError("file path must be a string")
    if not isinstance(old_word, str):
        raise TypeError("old_word must be a string")
    if not isinstance(new_word, str):
        raise TypeError("new_word must be a string")

    temp_path = file_path + '.tmp'

    try:
        # Stream line by line into a temporary file, so the whole file is never in memory.
        with open(file_path, 'r', encoding='utf-8') as input_file, \
             open(temp_path, 'w', encoding='utf-8') as output_file:
            for line in input_file:
                updated_line = line.replace(old_word, new_word)
                output_file.write(updated_line)

        # Swap the temporary file into place only after it was written completely.
        os.replace(temp_path, file_path)
        print('Replacement done!')
    except Exception:
        # On any failure, remove the partial temporary file and re-raise the original error.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise

In [6]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/replace.txt'

old_word = 'big'
new_word = 'small'

replace_word(file_path, old_word, new_word)

Replacement done!
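
Note that `line.replace(old_word, new_word)` replaces every matching substring, so replacing `big` also changes `bigger` into `smallger`. If only whole words should change, one option is a regular expression with `\b` word boundaries. The helper below is a hypothetical sketch (`replace_whole_word` is not part of the solution above) and assumes `old_word` starts and ends with a letter, digit, or underscore, since `\b` behaves differently next to other characters.

```python
import re

def replace_whole_word(line, old_word, new_word):
    """Replace old_word only where it appears as a complete word."""
    # re.escape keeps characters such as '.' or '+' in old_word literal.
    pattern = re.compile(rf"\b{re.escape(old_word)}\b")
    # A function replacement inserts new_word as-is, without interpreting backslashes.
    return pattern.sub(lambda _match: new_word, line)

print(replace_whole_word("a big dog is bigger", "big", "small"))  # a small dog is bigger
```

Inside `replace_word`, this helper could take the place of the `line.replace(...)` call; compiling the pattern once outside the loop would avoid repeated work on large files.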


---

**Q14: File Metadata Extractor**  
*Hard | Level 7*  
Write a function `file_metadata(file_path)` that returns:  
- file size (bytes)  
- last modified time  
- creation time.

---

In [7]:
import os
from datetime import datetime

def file_metadata(file_path):
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"The file {file_path} does not exist")

    if not os.path.isfile(file_path):
        raise ValueError(f"{file_path} is not a regular file")

    stat_info = os.stat(file_path)

    size_bytes = stat_info.st_size
    last_modified = datetime.fromtimestamp(stat_info.st_mtime).isoformat()

    # st_birthtime is the real creation time on platforms that expose it (e.g. macOS).
    # Where it is missing, fall back to st_ctime: that is the creation time on Windows,
    # but on Linux it is the time of the last metadata change, not the creation time.
    try:
        created_timestamp = stat_info.st_birthtime
    except AttributeError:
        created_timestamp = stat_info.st_ctime
    creation_time = datetime.fromtimestamp(created_timestamp).isoformat()

    return {
        'size_bytes': size_bytes,
        'last_modified': last_modified,
        'creation_time': creation_time
    }

In [8]:
file_path = 'C:/Users/Mahbub/Desktop/Data Engineering/Python/data/replace.txt'


response = file_metadata(file_path)
print(response)

{'size_bytes': 66, 'last_modified': '2025-04-23T02:12:08.838062', 'creation_time': '2025-04-23T02:11:59.899045'}
