
# **1. Smart Text Cleaner & Analyzer**

The line **content.translate(str.maketrans('', '', string.punctuation))** is used for cleaning text by removing all punctuation. Here's a breakdown:

**string.punctuation:** This is a string containing the 32 ASCII punctuation characters: ``!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~``.

**str.maketrans('', '', string.punctuation):** This static method creates a translation table. In this case, the first two arguments are empty strings, meaning no characters will be mapped to others (replaced). The third argument, string.punctuation, specifies characters to be deleted during the translation process.

**content.translate(...):** This string method applies the created translation table to the content string. As a result, all punctuation characters present in content will be removed.
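
As a quick illustration of the three pieces described above, here is a minimal sketch on a made-up string (the `sample` text and the `table` and `cleaned` names are illustrative assumptions, not part of the project's input file):

```python
import string

# Hypothetical sample text, used only for this illustration
sample = "Hello, world! It's a test-case: does it work?"

# Translation table that maps nothing and deletes every punctuation character
table = str.maketrans('', '', string.punctuation)

# Lowercase first, then strip punctuation, as the analyzer does
cleaned = sample.lower().translate(table)

print(cleaned)          # hello world its a testcase does it work
print(cleaned.split())  # list of the individual words
```

Note that apostrophes and hyphens are deleted too, so "It's" becomes "its" and "test-case" becomes "testcase". That is usually acceptable for simple word counts, but it is worth keeping in mind when reading the results.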


In [None]:
import re
import string
import os
from collections import Counter

def fileAnalyzer(filename, top_n, output_filename=None):
  try:
    with open(filename, 'r', encoding='utf-8') as f:
      text = f.read()
  except FileNotFoundError:
    if output_filename:
      with open(output_filename, 'w') as out_f:
        out_f.write(f"Error: The file '{filename}' was not found.\n")
    else:
      print(f"Error: The file '{filename}' was not found.")
    return

  # Clean the text: convert to lowercase and remove punctuation
  content = text.lower()
  content = content.translate(str.maketrans('', '', string.punctuation))

  words = content.split()
  unique_words = set(words)
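  # Split on sentence-ending punctuation and drop empty fragments,
  # so a trailing '.' does not add a phantom sentence
  sentences = [s for s in re.split(r'[.!?]+', text) if s.strip()]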
  # Use Counter to efficiently count occurrences
  word_counts = Counter(words)

  # Get the top N most common words
  top_keywords = word_counts.most_common(top_n)
  base_filename = os.path.basename(filename)

  output_lines = []
  output_lines.append(f"Total words: {len(words)}\n")
  output_lines.append(f"Unique words: {len(unique_words)}\n")
  output_lines.append(f"Total sentences: {len(sentences)}\n")
  output_lines.append(f"Top {top_n} keywords in '{base_filename}':\n")
  # Format each keyword and its count with an f-string
  for word, count in top_keywords:
    output_lines.append(f"• '{word}': {count} occurrences\n")

  if output_filename:
    with open(output_filename, 'w', encoding='utf-8') as out_f:
      out_f.writelines(output_lines)
    print(f"Analysis results written to '{output_filename}'")
  else:
    for line in output_lines:
      print(line.strip())

# If sample_text file is not available, look in 2026 Python Projects Folder in Drive
fileAnalyzer('/content/drive/MyDrive/2026 Python Projects/sample_text.txt', 5, '/content/sample_data/analyzed_text.txt')

Analysis results written to '/content/sample_data/analyzed_text.txt'
