# Instructions
The goal of this exercise is to create a class that helps you analyze a text. The text can be a simple string, like “Today, is a happy day”, or it can come from an external text file.



# Part I
First, we will analyze a simple string, like “A good book would sometimes cost as much as a good house.”

1. Create a class called Text that takes a string as an argument and stores the text in an attribute.
*Hint*: Copy and paste the text directly into the code.

2. Implement the following methods:
    - a method that returns the frequency of a word in the text (assume words are separated by whitespace). If the word does not appear, return None or a meaningful message.
    - a method that returns the most common word in the text.
    - a method that returns a list of all the unique words in the text.

In [1]:
class Text:
    def __init__(self, text):
        self.text = text
    
    def word_frequency(self, word):
        words = self.text.split()
        count = words.count(word)
        if count == 0:
            return f"The word '{word}' does not appear in the text."
        return f"The word '{word}' appears {count} times."

    def most_common_word(self):
        words = self.text.split()
        word_count = {word: words.count(word) for word in set(words)}
        most_common = max(word_count, key=word_count.get)
        return f"The most common word is '{most_common}', appearing {word_count[most_common]} times."

    def unique_words(self):
        words = self.text.split()
        unique = set(words)
        return list(unique)

# Example usage
text_example = "A good book would sometimes cost as much as a good house."
text_analysis = Text(text_example)
print(text_analysis.word_frequency("good"))  # Frequency of a specific word
print(text_analysis.most_common_word())      # Most common word
print(text_analysis.unique_words())          # List of unique words

The word 'good' appears 2 times.
The most common word is 'as', appearing 2 times.
['would', 'cost', 'sometimes', 'as', 'a', 'A', 'good', 'much', 'house.', 'book']
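
The output above lists 'A' and 'a' as separate words and keeps the period on 'house.', because splitting on whitespace does no normalization. The following is a minimal sketch of one way to handle this, assuming that lowercasing and stripping punctuation from word edges is the desired behavior. The helper names `normalized_words` and `word_counts` are hypothetical and not part of the exercise.

```python
import string
from collections import Counter


def normalized_words(text):
    """Split on whitespace, lowercase, and strip punctuation from word edges."""
    words = (word.strip(string.punctuation).lower() for word in text.split())
    # Drop tokens that consisted only of punctuation and are now empty.
    return [word for word in words if word]


def word_counts(text):
    """Count every normalized word in a single pass over the text."""
    return Counter(normalized_words(text))


counts = word_counts("A good book would sometimes cost as much as a good house.")
print(counts["a"])      # 'A' and 'a' are now counted together
print(counts["house"])  # the trailing period no longer sticks to the word
print(counts.most_common(3))
```

`Counter` also avoids calling `list.count` once per unique word, which matters for a text the size of a whole book. Note that with this normalization three words tie for most common in the example sentence, so a "most common word" method may need a tie-breaking rule.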


# Part II
Then, we will analyze a text coming from an external text file. Download the `the_stranger.txt` file.

1. Implement a class method that returns a Text instance built from the contents of a text file:

`    >>> Text.from_file('the_stranger.txt')`  
*Hint*: You need to open and read the text from the text file.


2. Now, use the provided the_stranger.txt file and try using the class you created above.

In [3]:
class Text:
    def __init__(self, text):
        self.text = text

    @classmethod
    def from_file(cls, file_path):
        with open(file_path, 'r') as file:
            content = file.read()
        return cls(content)
    
    def word_frequency(self, word):
        words = self.text.split()
        count = words.count(word)
        if count == 0:
            return f"The word '{word}' does not appear in the text."
        return f"The word '{word}' appears {count} times."

    def most_common_word(self):
        words = self.text.split()
        word_count = {word: words.count(word) for word in set(words)}
        most_common = max(word_count, key=word_count.get)
        return f"The most common word is '{most_common}', appearing {word_count[most_common]} times."

    def unique_words(self):
        words = self.text.split()
        unique = set(words)
        return list(unique)

# Using the file 'the_stranger.txt'
text_analysis = Text.from_file('the_stranger.txt')
print(text_analysis.most_common_word())  # Analyze most common word in the file
print(text_analysis.unique_words())      # List of unique words in the file


The most common word is 'the', appearing 1807 times.
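
When `open` is called without an `encoding` argument, Python uses a platform-dependent default, so the same file can be read differently on different machines. The sketch below shows a variant of `from_file` with an explicit encoding. It assumes the file is UTF-8 encoded, which is not confirmed for `the_stranger.txt`; the `encoding` parameter and the temporary sample file are illustrative additions.

```python
import os
import tempfile


class Text:
    def __init__(self, text):
        self.text = text

    @classmethod
    def from_file(cls, file_path, encoding="utf-8"):
        # Assumption: the file is UTF-8; pass another encoding if it is not.
        with open(file_path, "r", encoding=encoding) as file:
            return cls(file.read())


# Demo with a throwaway file so the sketch runs without the_stranger.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as sample:
        # "\u2666" is the diamond character that appears in the book's page header.
        sample.write("Albert Camus \u2666 THE STRANGER")
    sample_text = Text.from_file(sample_path)

print(len(sample_text.text.split()))  # number of whitespace-separated tokens
```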


# Bonus:
1. Create a class called TextModification that inherits from Text.

2. Implement the following methods:
    - a method that returns the text without any punctuation.
    - a method that returns the text without any English stop words (look up what these are!).
    - a method that returns the text without any special characters.
    
*Note*: Instead of creating a child class, you could also implement these methods as static methods in the Text class.

*Note*: Feel free to implement/create any attribute, method or function needed to make this work, be creative :)
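
The static-method alternative mentioned in the first note could look roughly like the sketch below. It is only an illustration: the methods take the text as an argument instead of reading `self.text`, and `SAMPLE_STOP_WORDS` is a tiny hypothetical stop-word set used here so the example runs without extra downloads. A real stop-word list is much longer.

```python
import string

# Hypothetical, deliberately tiny stop-word set (an assumption for this sketch).
SAMPLE_STOP_WORDS = frozenset({"a", "an", "the", "is", "as", "of", "and", "to", "in"})


class Text:
    def __init__(self, text):
        self.text = text

    @staticmethod
    def remove_punctuation(text):
        # Delete every character listed in string.punctuation (ASCII only).
        return text.translate(str.maketrans("", "", string.punctuation))

    @staticmethod
    def remove_stopwords(text, stop_words=SAMPLE_STOP_WORDS):
        # Compare in lowercase so "A" and "a" are both filtered out.
        return " ".join(w for w in text.split() if w.lower() not in stop_words)

    @staticmethod
    def remove_special_characters(text):
        # Keep only letters, digits, and whitespace.
        return "".join(c for c in text if c.isalnum() or c.isspace())


sentence = "A good book would sometimes cost as much as a good house."
print(Text.remove_punctuation(sentence))
print(Text.remove_stopwords(sentence))
```

Static methods can be called without creating an instance, for example `Text.remove_punctuation(some_string)`, which is convenient when the cleaning steps should work on any string. The child-class approach keeps the cleaning logic separate from the analysis methods.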

In [7]:
import string
import nltk
from nltk.corpus import stopwords  # needed by remove_stopwords below

nltk.download('stopwords')

class TextModification(Text):
    def remove_punctuation(self):
        no_punctuation = self.text.translate(str.maketrans('', '', string.punctuation))
        return no_punctuation

    def remove_stopwords(self):
        stop_words = set(stopwords.words('english'))
        words = self.text.split()
        filtered_text = ' '.join(word for word in words if word.lower() not in stop_words)
        return filtered_text

    def remove_special_characters(self):
        cleaned_text = ''.join(char for char in self.text if char.isalnum() or char.isspace())
        return cleaned_text

# Using TextModification with 'the_stranger.txt'
text_mod = TextModification.from_file('the_stranger.txt')
print(text_mod.remove_punctuation())        # Text without punctuation
print(text_mod.remove_stopwords())          # Text without stopwords
print(text_mod.remove_special_characters()) # Text without special characters


[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\d1411\AppData\Roaming\nltk_data...


Albert Camus ♦ THE STRANGER 



THE 



Stranger 



By ALBERT CAMUS 



Translated from the French 
by Stuart Gilbert 




VINTAGE BOOKS 

A Division of Random House 



NEW YORK 



Albert Camus ♦ THE STRANGER 



VINTAGE BOOKS 

are published by Alfred A Knopf Inc 
and Random House Inc 

Copyright 1942 by Librairie Gallimard as LETRANGER 

Copyright 1946 by ALFRED A KNOPF INC All rights reserved No part of this book may be reproduced in any form without permission in 
writing from the publisher except by a reviewer who may quote brief passages in a review to be printed in a magazine or newspaper Manufactured 
in the United States of America Distributed in Canada by Random House of Canada Limited Toronto 



Albert Camus ♦ THE STRANGER 



Contents 

Contents 3 

Part One 4 

1 4 

II 14 

III 18 

IV 24 

V 28 

VI 32 

Part Two 40 

1 40 

II 46 

III 52 

IV 62 

V 68 

About the Author 77 



Albert Camus ♦ THE STRANGER 



Part One 



MOTHER died today Or maybe yesterday I cant

[nltk_data]   Unzipping corpora\stopwords.zip.
