This project builds connection graphs of ultra-high-net-worth individuals in Canada.
- Run the name-list crawler to collect names and basic profile information from Yahoo Finance. This crawler is adapted from yahoo_finance_scrap.
- Send the name list to the search engine crawler. When a keyword is wrapped in quotation marks, the search engine returns only results that contain that exact phrase, so the number of results for a query containing two quoted names can represent the connection strength between those two targets. In this project I use Bing to get the search amounts, because its results are more accurate than Google's and Google's anti-crawler algorithm is too powerful. Proxy IPs and randomized fake request headers are applied, and the crawler can be run on an AWS instance.
- Draw the graphs with pyecharts. You will need to prepare 3 variables: categories, links, and nodes.
- Two types of graphs are provided: circular and float.
- The strength level of a person is defined by that person's total search amount, and the size of that person's dot is defined by their total number of connections with other people.
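
The steps above can be sketched in Python. This is a minimal illustration, not the project's actual code: the function names `build_query` and `build_graph_data`, the equal-width bucketing into strength levels, the dot-size formula, and the sample names and counts are all assumptions made for the example. It shows how a quoted query for two names might be built, and how pairwise search amounts could be turned into the `nodes`, `links`, and `categories` variables.

```python
from collections import defaultdict


def build_query(name_a, name_b):
    """Wrap each name in quotation marks so only exact matches are counted."""
    return f'"{name_a}" "{name_b}"'


def build_graph_data(pair_counts, n_levels=3):
    """Turn {(name_a, name_b): search_amount} into nodes, links, categories."""
    total = defaultdict(int)   # total search amount per person
    degree = defaultdict(int)  # number of connections per person
    links = []
    for (a, b), count in pair_counts.items():
        if count <= 0:
            continue  # no results: treat the pair as unconnected
        links.append({"source": a, "target": b, "value": count})
        for name in (a, b):
            total[name] += count
            degree[name] += 1

    max_total = max(total.values(), default=0)
    categories = [{"name": f"level {i + 1}"} for i in range(n_levels)]
    nodes = []
    for name in sorted(total):
        # Assumed bucketing: split [0, max_total] into equal-width levels.
        level = min(n_levels - 1, total[name] * n_levels // max_total)
        nodes.append({
            "name": name,
            "symbolSize": 10 + 5 * degree[name],  # assumed size formula
            "value": total[name],
            "category": level,
        })
    return nodes, links, categories


# Made-up search amounts for three placeholder names.
pair_counts = {
    ("Person A", "Person B"): 120,
    ("Person A", "Person C"): 30,
    ("Person B", "Person C"): 0,
}
nodes, links, categories = build_graph_data(pair_counts)
```

With pyecharts, lists shaped like these can typically be passed to `Graph().add("", nodes, links, categories=categories, layout="circular")`; check the exact signature against the installed pyecharts version.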