# Simple Text Analysis Tool (Command-Line Tool)

This project builds a command-line tool that analyzes a text file. The tool reads the file's content and reports basic statistics: total character count, total word count, total line count, and the frequency of each word.

---

## 1. Initialization and Setup

We'll import the modules the tool needs (`os` to check that the file exists, `sys` to read command-line arguments, and `re` to strip punctuation from words) and create a sample text file for testing.

In [1]:
import os
import sys
import re # Regular expression module for cleaning words

# Create a sample text file for testing
sample_text_content = """
This is a sample text file for analysis.
It contains multiple lines.
Python is powerful, and python is fun!
Let's analyze this text.
""".strip()

with open('sample_text.txt', 'w', encoding='utf-8') as f:
    f.write(sample_text_content)

print("Created 'sample_text.txt' for testing.")

Created 'sample_text.txt' for testing.


---

## 2. Text File Reading Function

The `read_text_file()` function reads the file's content. If the file does not exist or cannot be read, it prints an error message and returns `None`.

In [2]:
def read_text_file(filepath):
    """Reads the content of a text file."""
    if not os.path.exists(filepath):
        print(f"Error: File '{filepath}' does not exist.")
        return None
    
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        return content
    except Exception as e:
        print(f"Error reading file '{filepath}': {e}")
        return None
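
As an aside, the existence check and the `try` block above can be folded into one step by opening the file directly and catching the specific exceptions. The following is a minimal sketch of that alternative, not part of the tool itself; the name `read_text_file_eafp` is hypothetical.

```python
def read_text_file_eafp(filepath):
    """Hypothetical variant: open the file directly and handle failures."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        # Raised by open() when the path does not exist
        print(f"Error: File '{filepath}' does not exist.")
    except (OSError, UnicodeDecodeError) as e:
        # Other I/O problems (permissions, directories) or invalid UTF-8
        print(f"Error reading file '{filepath}': {e}")
    return None
```

This version avoids the small window between checking for the file and opening it, at the cost of slightly longer exception handling.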

---

## 3. Text Analysis Function

The `analyze_text()` function counts characters, words, and lines, and calculates word frequencies. It uses a regular expression to strip punctuation, string methods to split the text into words, and a dictionary to tally each word.

In [3]:
def analyze_text(text_content):
    """Analyzes text content and returns statistics."""
    if not text_content:
        return {
            'char_count': 0,
            'word_count': 0,
            'line_count': 0,
            'word_frequency': {}
        }

    # Count characters (including spaces and newlines)
    char_count = len(text_content)

    # Count lines (splitlines() does not add a phantom extra line when the text ends with '\n')
    line_count = len(text_content.splitlines())

    # Count words and word frequency
    # Remove punctuation and convert all to lowercase (apostrophes are removed too, so "Let's" becomes "lets")
    cleaned_text = re.sub(r'[^\w\s]', '', text_content).lower() # Keep letters, numbers, underscores, whitespace
    words = cleaned_text.split()
    word_count = len(words)

    word_frequency = {}
    for word in words:
        if word:
            word_frequency[word] = word_frequency.get(word, 0) + 1

    return {
        'char_count': char_count,
        'word_count': word_count,
        'line_count': line_count,
        'word_frequency': word_frequency
    }
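
The manual dictionary tally in `analyze_text()` can also be written with `collections.Counter` from the standard library. The sketch below assumes the same cleaning step as above; the helper name `count_words` is hypothetical and only meant to illustrate the idea.

```python
import re
from collections import Counter

def count_words(text_content):
    """Hypothetical helper: word frequencies via collections.Counter."""
    # Same cleaning step as analyze_text(): drop punctuation, then lowercase
    cleaned_text = re.sub(r'[^\w\s]', '', text_content).lower()
    return Counter(cleaned_text.split())

freq = count_words("Python is powerful, and python is fun!")
print(freq['python'])   # 2
print(freq['missing'])  # 0, a Counter returns 0 for absent keys
```

A `Counter` is a `dict` subclass, so it could be stored under `'word_frequency'` without changing the rest of the tool.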

---

## 4. Display Analysis Results Function

The `display_analysis_results()` function prints the statistics, followed by the word frequencies sorted from most to least common.

In [4]:
def display_analysis_results(results):
    """Displays the text analysis statistics."""
    print("\n--- TEXT ANALYSIS RESULTS ---")
    print(f"Total Characters: {results['char_count']}")
    print(f"Total Words: {results['word_count']}")
    print(f"Total Lines: {results['line_count']}")

    print("\n--- Word Frequency ---")
    if not results['word_frequency']:
        print("No words to analyze.")
        return

    # Sort word frequency by count in descending order
    sorted_word_freq = sorted(results['word_frequency'].items(), key=lambda item: item[1], reverse=True)

    for word, count in sorted_word_freq:
        print(f"'{word}': {count}")
    print("------------------------------------")
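
The sort above orders words by count only, so words with equal counts stay in the order they first appeared in the text. If a fully deterministic listing is preferred, one possible approach is to break ties alphabetically and optionally keep only the top few entries. The helper `top_words` and its parameter `n` are hypothetical names for this sketch.

```python
def top_words(word_frequency, n=3):
    """Hypothetical helper: the n most frequent words, ties in alphabetical order."""
    # Sort by descending count first, then by the word itself
    ranked = sorted(word_frequency.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

sample_freq = {'is': 3, 'this': 2, 'text': 2, 'python': 2, 'a': 1}
print(top_words(sample_freq))  # [('is', 3), ('python', 2), ('text', 2)]
```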

---

## 5. Main Tool Logic (Using `sys.argv`)

This is the main part of the script. It will check command-line arguments to get the file path, then call the defined functions to read, analyze, and display the results.

**To run this section:**

If you are in a Jupyter Notebook, there are no real command-line arguments, but you can simulate them by assigning to `sys.argv` manually. For example:

```python
sys.argv = ['your_script_name.py', 'sample_text.txt'] # Replace 'sample_text.txt' with your file name
```

If you run from the terminal, save this code into a `.py` file (e.g., `text_analyzer.py`) and run:

```bash
python text_analyzer.py sample_text.txt
```
Or with another file:
```bash
python text_analyzer.py path/to/your/other_file.txt
```

In [5]:
def main():
    """Main function controlling the text analysis tool."""
    # sys.argv[0] is the script name, sys.argv[1] is the first argument
    if len(sys.argv) < 2:
        print("Usage: python text_analyzer.py <text_file_path>")
        print("Example: python text_analyzer.py sample_text.txt")
        return

    filepath = sys.argv[1]
    print(f"Analyzing file: {filepath}")

    text_content = read_text_file(filepath)

    if text_content is not None:
        analysis_results = analyze_text(text_content)
        display_analysis_results(analysis_results)

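# In a Jupyter Notebook, sys.argv must be simulated because there are no real command-line arguments.
if __name__ == "__main__":
    # The line below overrides sys.argv so the cell can be tested in Jupyter with the sample file.
    # Remove or comment it out when running the script from a terminal.
    sys.argv = ['your_script_name.py', 'sample_text.txt']
    main()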

# Clean up the dummy file after execution
if os.path.exists('sample_text.txt'):
    os.remove('sample_text.txt')
    print("\nRemoved 'sample_text.txt'.")

Analyzing file: sample_text.txt

--- TEXT ANALYSIS RESULTS ---
Total Characters: 132
Total Words: 23
Total Lines: 4

--- Word Frequency ---
'is': 3
'this': 2
'text': 2
'python': 2
'a': 1
'sample': 1
'file': 1
'for': 1
'analysis': 1
'it': 1
'contains': 1
'multiple': 1
'lines': 1
'powerful': 1
'and': 1
'fun': 1
'lets': 1
'analyze': 1
------------------------------------

Removed 'sample_text.txt'.
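
Reading `sys.argv` by hand works well for a single argument. As the tool grows, the standard library's `argparse` module can generate the usage message and validate arguments automatically. The following is a minimal sketch of that alternative, not the approach used above; the helper name `parse_args` is hypothetical.

```python
import argparse

def parse_args(argv=None):
    """Hypothetical helper: parse the file path with argparse."""
    parser = argparse.ArgumentParser(
        prog='text_analyzer.py',
        description='Analyze a text file and print basic statistics.',
    )
    parser.add_argument('filepath', help='path to the text file to analyze')
    # argv=None makes argparse read sys.argv[1:]; passing a list is handy in notebooks
    return parser.parse_args(argv)

args = parse_args(['sample_text.txt'])
print(args.filepath)  # sample_text.txt
```

Passing an explicit list to `parse_args()` also removes the need to overwrite `sys.argv` when testing in a notebook.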


---

## 6. Conclusion

You have built a simple but effective text analysis tool! This project gave you practice with file reading, string manipulation, dictionary-based frequency counting, and command-line arguments. It is a good foundation for building more complex text analysis tools in the future.