GreyMarket-Scanner

A study to scan grey markets to trace recent trends in antiquity sales, with scope to expand further into tracking illicitly trafficked art in black markets.


The study collects data from an antiquities (grey) market and analyses recent trends in its sales, which could be taken further to understand contemporary cultural value. For the data source, after considering e-commerce sites dealing in antiquities, it was decided to focus on higher-value markets such as auction houses, with their e-bids and sales of lots. Christie's proved the most informative and easily accessible source for data collection. The data was collected by web scraping with Selenium and was cleaned and tokenised using NLTK.

High-Level Overview:

  1. src

    src/webscraping:

    • webscraper.py scrapes the data using the Selenium library (a minimal sketch of this step appears after this overview). The scraped output, scrape_data.csv, is stored in the data folder.

    • geckodriver.log is the log file written by geckodriver, the Firefox driver that must be installed for Selenium to run.

  2. word_analysis

    The only module here is word_counter.py, which performs the NLP operations on the retrieved data. Its inputs are the scraped parameters of the antiquities, e.g. period and object name.

  3. data

    The folder contains the raw scraped data and the .txt files used in the NLP analysis in src/word_anlysis/word_counter.py. The scraped output .csv file contains the object name, the sale value, the period of the antiquity, and a link to the object. The data is then split and saved as individual parameters, e.g. objnames.txt, on which NLP is performed to find the most used terms.
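
The sketch below illustrates the kind of scraping step webscraper.py performs; the results URL, the CSS selectors, and the column names are illustrative assumptions rather than the module's actual values.

```python
# Illustrative sketch only: the URL and CSS selectors below are placeholders,
# not the ones used in webscraper.py.
import csv

from selenium import webdriver
from selenium.webdriver.common.by import By

RESULTS_URL = "https://www.christies.com/en/results"  # hypothetical results page

rows = []
driver = webdriver.Firefox()  # needs geckodriver on PATH; writes geckodriver.log
try:
    driver.get(RESULTS_URL)
    # Hypothetical selector for one lot "card" on the results page.
    for lot in driver.find_elements(By.CSS_SELECTOR, ".lot-card"):
        rows.append({
            "object_name": lot.find_element(By.CSS_SELECTOR, ".lot-title").text,
            "price": lot.find_element(By.CSS_SELECTOR, ".lot-price").text,
            "period": lot.find_element(By.CSS_SELECTOR, ".lot-period").text,
            "link": lot.find_element(By.TAG_NAME, "a").get_attribute("href"),
        })
finally:
    driver.quit()

# Save the lots in the same shape as data/scrape_data.csv.
with open("data/scrape_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["object_name", "price", "period", "link"])
    writer.writeheader()
    writer.writerows(rows)
```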

Run order:


webscraper.py -> word_counter.py

Analysis:

The word counter outputs the fifty most used words, revealing which cultures and materials occur most often in the lots and are therefore popular among buyers. A dictionary is compiled from these most used terms for the subsequent analysis. Two reference columns are then added to the scraped database, one for the dynasty of the artefact and the other for its material. Using the dictionary, partial textual matches on the object-name column (conditional if statements with wildcards) populate the reference columns. The data in these columns act as ordinal indicators for classifying the records, which further assists in visualisations.
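
A compressed sketch of the word-count and reference-column logic described above is given below; the column names and the term dictionaries are illustrative assumptions, and the actual implementation lives in word_counter.py.

```python
# Sketch of the word-count and reference-column steps; column names and the
# term dictionaries are illustrative assumptions.
from collections import Counter

import pandas as pd
from nltk.tokenize import word_tokenize  # requires nltk.download("punkt")

df = pd.read_csv("data/scrape_data.csv")

# Fifty most used words across the scraped object names.
tokens = [t.lower()
          for name in df["object_name"]
          for t in word_tokenize(str(name))
          if t.isalpha()]
top_fifty = Counter(tokens).most_common(50)

# Dictionaries compiled (by inspection) from the most used terms.
DYNASTIES = {"ming": "Ming", "qing": "Qing", "tang": "Tang"}
MATERIALS = {"bronze": "Bronze", "jade": "Jade", "porcelain": "Porcelain"}

def match(name: str, terms: dict) -> str:
    """Return the first dictionary label whose term appears in the object name."""
    lowered = str(name).lower()
    for term, label in terms.items():
        if term in lowered:
            return label
    return "Other"

# Two reference columns used later to classify and visualise the lots.
df["dynasty"] = df["object_name"].apply(lambda n: match(n, DYNASTIES))
df["material"] = df["object_name"].apply(lambda n: match(n, MATERIALS))
```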


Example of an analysis: a heat map of the summed sale prices of artefacts across dynastic cultures and material compositions.

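One possible way to build such a heat map from the reference columns, continuing from the previous sketch and assuming the scraped price strings can be coerced to numbers:

```python
# Sketch of the heat-map example; assumes the dynasty/material columns from the
# previous sketch and a "price" column that can be cleaned to numeric values.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Strip currency symbols and separators, then coerce to numbers.
df["price_num"] = pd.to_numeric(
    df["price"].astype(str).str.replace(r"[^\d.]", "", regex=True),
    errors="coerce",
)

# Sum of sale prices for each dynasty/material combination.
price_grid = df.pivot_table(index="dynasty", columns="material",
                            values="price_num", aggfunc="sum")

sns.heatmap(price_grid, annot=True, fmt=".0f", cmap="viridis")
plt.title("Total sale value by dynasty and material")
plt.tight_layout()
plt.show()
```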
