GreyMarket-Scanner

A study to scan grey markets to trace recent trends in antiquity sales, with scope to expand further into tracking illicitly trafficked art in black markets.


The study collects data from an antiquities (grey) market and analyses recent trends in its sales, which could be taken further to understand contemporary cultural value. For the data source, after considering e-commerce sites dealing in antiquities, it was decided to focus on higher-value markets such as auction houses, with their e-bids and sales of lots. Christie's proved the most informative and easily accessible source for data collection. The data was collected by web scraping with Selenium and was cleaned and tokenised using NLTK.

High-Level Overview:

  1. src

    src/webscraping:

    • webscraper.py scrapes the data using the Selenium library (a minimal sketch of this step appears after this overview). The scraped output, scrape_data.csv, is stored in the data folder.

    • geckodriver.log is the log file written by geckodriver, the Firefox driver that must be installed for Selenium to run.

  2. word_analysis

    The only module here is word_counter.py, which performs the NLP operations on the retrieved data. Its inputs are the scraped parameters of the antiquities, e.g. period and object name.

  3. data

    The folder contains the raw scraped data and the .txt files used in the NLP analysis in src/word_anlysis/word_counter.py. The scraped output .csv file contains the object name, the sale value, the period of the antiquity, and a link to the object. The data is then split and saved as individual parameters, e.g. objnames.txt, on which NLP is performed to find the most used terms.
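
The sketch below illustrates the kind of scraping step webscraper.py performs; the results URL, the CSS selectors, and the column names are illustrative assumptions rather than the module's actual values.

```python
# Illustrative sketch only: the URL and CSS selectors below are placeholders,
# not the ones used in webscraper.py.
import csv

from selenium import webdriver
from selenium.webdriver.common.by import By

RESULTS_URL = "https://www.christies.com/en/results"  # hypothetical results page

rows = []
driver = webdriver.Firefox()  # needs geckodriver on PATH; writes geckodriver.log
try:
    driver.get(RESULTS_URL)
    # Hypothetical selector for one lot "card" on the results page.
    for lot in driver.find_elements(By.CSS_SELECTOR, ".lot-card"):
        rows.append({
            "object_name": lot.find_element(By.CSS_SELECTOR, ".lot-title").text,
            "price": lot.find_element(By.CSS_SELECTOR, ".lot-price").text,
            "period": lot.find_element(By.CSS_SELECTOR, ".lot-period").text,
            "link": lot.find_element(By.TAG_NAME, "a").get_attribute("href"),
        })
finally:
    driver.quit()

# Save the lots in the same shape as data/scrape_data.csv.
with open("data/scrape_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["object_name", "price", "period", "link"])
    writer.writeheader()
    writer.writerows(rows)
```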

Run order:


webscraper.py -> word_counter.py

Analysis:

The word counter outputs the fifty most used words, revealing which cultures and materials occur most often in the lots and are therefore popular among buyers. A dictionary is compiled from these most used terms for the subsequent analysis. Two reference columns are then added to the scraped database, one for the dynasty of the artefact and the other for its material. Using the dictionary, partial textual matches on the object-name column (conditional if statements with wildcards) populate the reference columns. The data in these columns act as ordinal indicators for classifying the records, which further assists in visualisations.
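
A compressed sketch of the word-count and reference-column logic described above is given below; the column names and the term dictionaries are illustrative assumptions, and the actual implementation lives in word_counter.py.

```python
# Sketch of the word-count and reference-column steps; column names and the
# term dictionaries are illustrative assumptions.
from collections import Counter

import pandas as pd
from nltk.tokenize import word_tokenize  # requires nltk.download("punkt")

df = pd.read_csv("data/scrape_data.csv")

# Fifty most used words across the scraped object names.
tokens = [t.lower()
          for name in df["object_name"]
          for t in word_tokenize(str(name))
          if t.isalpha()]
top_fifty = Counter(tokens).most_common(50)

# Dictionaries compiled (by inspection) from the most used terms.
DYNASTIES = {"ming": "Ming", "qing": "Qing", "tang": "Tang"}
MATERIALS = {"bronze": "Bronze", "jade": "Jade", "porcelain": "Porcelain"}

def match(name: str, terms: dict) -> str:
    """Return the first dictionary label whose term appears in the object name."""
    lowered = str(name).lower()
    for term, label in terms.items():
        if term in lowered:
            return label
    return "Other"

# Two reference columns used later to classify and visualise the lots.
df["dynasty"] = df["object_name"].apply(lambda n: match(n, DYNASTIES))
df["material"] = df["object_name"].apply(lambda n: match(n, MATERIALS))
```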


Example of an analysis: a heat map of the summed sale prices of artefacts across dynastic cultures and material compositions.

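One possible way to build such a heat map from the reference columns, continuing from the previous sketch and assuming the scraped price strings can be coerced to numbers:

```python
# Sketch of the heat-map example; assumes the dynasty/material columns from the
# previous sketch and a "price" column that can be cleaned to numeric values.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Strip currency symbols and separators, then coerce to numbers.
df["price_num"] = pd.to_numeric(
    df["price"].astype(str).str.replace(r"[^\d.]", "", regex=True),
    errors="coerce",
)

# Sum of sale prices for each dynasty/material combination.
price_grid = df.pivot_table(index="dynasty", columns="material",
                            values="price_num", aggfunc="sum")

sns.heatmap(price_grid, annot=True, fmt=".0f", cmap="viridis")
plt.title("Total sale value by dynasty and material")
plt.tight_layout()
plt.show()
```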
