100,000 artwork links (links only), which correspond to a period of about 30 days counted from the day the links were gathered. Of these, 50,000 artworks were scraped and contain data, and ~40,000+ of them are unique (artworks from the same artist).
The dataset includes the following columns:
- Role
- Company the artist works at (if mentioned or extracted)
- Date artwork was posted
- Number of views
- Number of likes
- Number of comments
- Which software was used
- Which tags were used
- Artwork title
- Artwork URL
The Kaggle dataset is available here.
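To make the column list above concrete, here is a minimal sketch of reading such a file with Python's standard `csv` module. The header names (`role`, `company`, `date_posted`, and so on), the `;` separator for multi-valued columns, and the two sample rows are all assumptions made for illustration; the actual Kaggle file may be laid out differently.

```python
import csv
import io

# Made-up placeholder rows for illustration only (not taken from the dataset).
# Header names are hypothetical; the real CSV may label its columns differently.
SAMPLE_CSV = """role,company,date_posted,views,likes,comments,software,tags,title,url
3D Artist,,2020-01-05,120,15,2,Blender;Substance Painter,scifi;robot,Example A,https://example.com/a
Concept Artist,Example Studio,2020-01-06,300,40,5,Photoshop,fantasy,Example B,https://example.com/b
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with typed values."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        # Count columns become integers; an empty cell is treated as 0.
        for key in ("views", "likes", "comments"):
            row[key] = int(row[key]) if row[key] else 0
        # Multi-valued columns are assumed to be ';'-separated.
        for key in ("software", "tags"):
            row[key] = [v.strip() for v in row[key].split(";") if v.strip()]
        # An empty company means it was not mentioned or could not be extracted.
        row["company"] = row["company"] or None
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
```

With the real file, `io.StringIO(text)` would be replaced by an opened file handle, and the header names adjusted to match.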
As the disclaimer says, this is the first time I'm doing this. I ask anyone who uses this dataset to respect the artists' privacy by not using their email addresses in any way, even though this is publicly available data published by the artists themselves. Please correct me if I've said something wrong here.
The goal of this project was to better understand the process of gathering, processing, cleaning, analyzing, and visualizing data. Besides that, I wanted to find out which software, tags, and affiliations are the most popular among artists.
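The "most popular software, tag, affiliation" question comes down to frequency counting. Below is a minimal sketch using `collections.Counter`. The field names, the helper `top_values`, and the sample records are hypothetical and are not the code or data used for the actual analysis.

```python
from collections import Counter

# Made-up placeholder records for illustration only (not taken from the dataset).
artworks = [
    {"software": ["Blender", "Photoshop"], "tags": ["scifi"], "company": None},
    {"software": ["Blender"], "tags": ["scifi", "robot"], "company": "Example Studio"},
    {"software": ["ZBrush"], "tags": ["fantasy"], "company": "Example Studio"},
]


def top_values(records, field, n=3):
    """Return the n most common values of a field across all records."""
    counts = Counter()
    for rec in records:
        value = rec.get(field)
        if value is None:
            continue  # skip artworks where the field is missing
        values = value if isinstance(value, list) else [value]
        # set() so a value repeated within one artwork is counted once
        counts.update(set(values))
    return counts.most_common(n)


print(top_values(artworks, "software", 1))
print(top_values(artworks, "tags", 1))
print(top_values(artworks, "company", 1))
```

Counting each value once per artwork is a design choice: it answers "how many artworks use X" rather than "how many times X was listed".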
While transitioning from 3D modeling to data analytics and Python programming, I decided to create a personal project to analyze something I have a close connection with. I have really enjoyed seeing the progression of the 3D world (games, feature films, etc.).
Libraries and tools used:
- requests
- json
- Google Sheets API
- selenium
- regex
- Google Sheets
- Tableau
Note: the following visualizations contain data bias. Not every tag or affiliation was taken into account, because of difficulties with data extraction and mistakes I made. Tableau Public dashboard
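One way to get a feel for the size of this bias is to check how many records actually have a value in a given column. The sketch below is only an illustration: the helper `coverage`, the field names, and the sample records are hypothetical, and no such figures were computed for the real dataset here.

```python
def coverage(records, field):
    """Return the fraction of records with a non-empty value for the field."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if rec.get(field))
    return filled / len(records)


# Made-up placeholder records for illustration only (not taken from the dataset).
records = [
    {"tags": ["scifi"], "company": "Example Studio"},
    {"tags": [], "company": None},
    {"tags": ["robot"], "company": None},
    {"tags": ["fantasy"], "company": "Example Studio"},
]

tag_coverage = coverage(records, "tags")
company_coverage = coverage(records, "company")
```

A low coverage value for a column would suggest that charts built on it describe only part of the scraped artworks.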