This project, carried out as part of a Master's degree specializing in Data Storytelling, applies Natural Language Processing (NLP) techniques to analyze user feedback on Avenue's Reclame Aqui page. The objective is to identify the key topics and the language users employ in negative feedback, giving Avenue's UX team actionable insights and a data-driven controlled vocabulary.
Natural Language Processing (NLP) is the use of computational methods to process and analyze large amounts of natural language data, with the aim of deriving meaningful, usable information from human language.
- Tool Used: Octoparse
- Description: User comments were gathered from Avenue's page on Reclame Aqui using Octoparse to scrape data directly from the web.
- Tool Used: Python
- Description: The scraped data was cleaned and structured using Python, making it ready for analysis.
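A minimal sketch of what such a cleaning step could look like, using only the Python standard library. The function name `clean_comment`, the tiny `STOPWORDS` set, and the sample sentence are illustrative assumptions, not the notebook's actual code or data.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real run would use a fuller Portuguese list.
STOPWORDS = {"a", "o", "de", "da", "do", "e", "que", "em", "para", "com", "um", "uma"}


def clean_comment(text: str) -> list:
    """Lowercase a comment, strip URLs, punctuation and digits, and drop stopwords."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^\w\s]|\d", " ", text)     # remove punctuation and digits
    tokens = text.split()
    # Keep tokens longer than one character that are not stopwords
    return [t for t in tokens if len(t) > 1 and t not in STOPWORDS]


# Invented sample comment, not taken from the scraped data
sample = "Cobraram uma TAXA de inatividade!!! Veja: https://example.com"
print(clean_comment(sample))
# → ['cobraram', 'taxa', 'inatividade', 'veja']
```

Returning a list of tokens rather than a string keeps the output directly usable by later tokenized steps such as topic modeling.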
- Tool Used: spaCy
- Description: Analyzed parts of speech within the feedback to gain deeper linguistic insights.
- Tool Used: LDA Model
- Description: Applied Latent Dirichlet Allocation (LDA) to uncover prevalent themes within the feedback.
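To illustrate what LDA does, here is a small, dependency-free sketch that fits the model with collapsed Gibbs sampling. This is a stand-in for demonstration: the project itself may well have used a library implementation, and the function name `lda_gibbs`, the hyperparameter defaults, and the toy documents are all assumptions made for this example.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs (lists of str) via collapsed Gibbs sampling.

    Returns per-document topic counts and the top 3 words of each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per doc
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []                                            # topic of every token

    # Random initial topic assignment for every token
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, z = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    top_words = [
        [vocab[i] for i in sorted(range(n_words), key=lambda i: -row[i])[:3]]
        for row in topic_word
    ]
    return doc_topic, top_words


# Invented toy documents, not the project's data
docs = [
    ["fee", "charge", "inactivity", "fee"],
    ["charge", "fee", "inactivity"],
    ["account", "closure", "cancellation"],
    ["cancellation", "account", "closure", "account"],
]
doc_topic, top_words = lda_gibbs(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"Topic {k}: {words}")
```

Each document ends up as a mix of topic counts and each topic as a ranked list of words, which is the kind of output used to name themes such as fees or account closure. On real data, the number of topics and the priors `alpha` and `beta` would need tuning.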
- Methods: Heatmaps and Word Clouds
- Purpose: Translated the analysis findings into visual formats to highlight key linguistic trends and topics.
- Revealed a blend of informal language and formal expressions within the feedback, highlighting the varying communication preferences across different platforms.
- Identified critical topics such as Fees, Cards, Inactivity, Charges, Cancellation, Account, and Closure, directing targeted improvements to enhance user engagement.
- The project advanced academic knowledge in NLP and provided practical insights beneficial for content design.
- Established a controlled vocabulary to standardize communication and ensure clarity.
- Set the groundwork for future projects to automate internal tools and expand analysis, enhancing NLP applications in real-world scenarios.
- The data files needed to run the analysis are located in the `DATA` folder of this GitHub repository. Make sure these files are downloaded and accessible before starting the notebook.
- Access the Notebook: Click here to access the project notebook
- Open the notebook link provided.
- Once the notebook is open in your browser, you can run the cells sequentially to see the analysis process and results.
- To run a cell, click on it and then press the play button or use the shortcut `Shift + Enter`.
- If you wish to modify or experiment with the analysis, you can save a copy of the notebook to your Google Drive.
- Click `File` > `Save a copy in Drive...` to create your own version of the notebook.
Contributions are welcome! Please feel free to fork the project, make changes, and submit pull requests if you have suggestions or improvements.