🎥 Film Analysis Project: Uncovering Cinematic Insights

🎯 Objective

The aim of this project was to delve deep into the realm of cinema and derive meaningful insights through the lens of data journalism. My mission was to analyze a dataset consisting of films, critics, and directors, originally sourced from a BBC article. The goal was to identify trends and narratives in film critique and production spanning the last 15 years, offering a unique perspective on the global film industry.

🛠️ Process

Data Collection

The project commenced with the complex task of data collection. Leveraging Python libraries such as BeautifulSoup and requests, I meticulously scraped a comprehensive list of movies, critics, and their affiliations from the BBC's webpage. This phase required acute attention to detail, especially due to the minimalistic layout of the source page, which presented its own set of challenges in data isolation.

Data Structuring

Following data acquisition, the next phase involved structuring the raw HTML data into a format amenable for analysis. I transformed the data into Python lists and dictionaries and subsequently into a structured pandas DataFrame. This process was critical in preparing the data for the subsequent analysis phase.

Regular Expressions

I extensively utilized regular expressions to parse and extract essential information from the dataset, including critic names, affiliations, countries, movie titles, directors, and release years.

Additional Data Enrichment

Parallel to the initial data scraping, I expanded the data collection to Wikipedia. This involved scraping director biographical details, thereby enriching the dataset with contextual information about their birthplaces and dates.

Data Visualization with Mapbox API

In an effort to bring the data to life visually, I integrated the Mapbox API for advanced geospatial data representations, allowing me to create interactive and engaging maps that vividly illustrate the global distribution and impact of film directors and critics.

🔎 Key Findings

🇺🇸 The US, as expected, had the most number of critically acclaimed directors as well as the highest number of votes from critics
🇨🇳 Movies directed by Chinese directors had the second-highest number of votes - 98 in total
🇮🇷 A surprising finding was that Iranian movies were very popular with critics. Six directors were critically acclaimed, with Asghar Farhadi leading the pack with 31 votes

📖 Key Learnings

Data Scraping and Cleaning: This project significantly enhanced my skills in web scraping and the intricacies of data cleaning and preparation.
Regular Expressions Mastery: Through extensive use, I developed a proficiency in using regex for effective data extraction.
Pandas Proficiency: The process of data transformation using pandas was invaluable, improving my ability to manipulate and analyze large datasets.
Creative Problem-Solving: Each challenge encountered necessitated unique and creative solutions, underscoring the importance of adaptability and innovation in data journalism.

💭 Reflections and Future Directions

In retrospect, several areas could be refined in future projects:

Automation and Efficiency: Automating certain aspects of the scraping process would enhance efficiency.
Advanced Data Analysis: Implementing more sophisticated statistical methods could yield deeper insights.
Enhanced Visual Storytelling: Incorporating advanced data visualization techniques would further augment the storytelling aspect of the data.

🎬 Conclusion

This project represents a comprehensive endeavor in data journalism, blending technical data processing with narrative storytelling. It stands as a testament to the power of data in revealing hidden stories within the film industry.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
docs		docs
.DS_Store		.DS_Store
.gitattributes		.gitattributes
.gitignore		.gitignore
BBC-Dataframes.ipynb		BBC-Dataframes.ipynb
BBC_project_code_steps.ipynb		BBC_project_code_steps.ipynb
README.md		README.md
backup_BBC1.csv		backup_BBC1.csv
custom.geo (1).json		custom.geo (1).json
geo-data12-13.js		geo-data12-13.js
movie_data.csv		movie_data.csv
movie_data_updated.csv		movie_data_updated.csv
top_director_cob_3.csv		top_director_cob_3.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎥 Film Analysis Project: Uncovering Cinematic Insights

🎯 Objective

🛠️ Process

Data Collection

Data Structuring

Regular Expressions

Additional Data Enrichment

Data Visualization with Mapbox API

🔎 Key Findings

📖 Key Learnings

💭 Reflections and Future Directions

🎬 Conclusion

About

Releases

Packages

Languages

radhika3558/databases-mapbox-project

Folders and files

Latest commit

History

Repository files navigation

🎥 Film Analysis Project: Uncovering Cinematic Insights

🎯 Objective

🛠️ Process

Data Collection

Data Structuring

Regular Expressions

Additional Data Enrichment

Data Visualization with Mapbox API

🔎 Key Findings

📖 Key Learnings

💭 Reflections and Future Directions

🎬 Conclusion

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages