
Roughly 1 billion private messages per day are selected and routed to the closest "operator" based on geolocation in China


cookiemonster/WeChat

 
 


WeChat

| Type | Value |
| --- | --- |
| Source | 211.159.163.137 - City: Beijing - Country: China - Organization: Tencent cloud computing |
| Description | 1,081,231,257 captured WeChat dialogues containing 3,784,309,399 messages dated 18 March 2019 were automatically selected for review based on a keyword trigger. Not all of the dialogues were in Chinese, and not all had GPS coordinates only within China. |

Project files:

Natural Language Processing tools

Research files

Update

  • January 2020: The Google Translator Toolkit service has been shut down. The translated data is still available via Google Takeout. Refactoring of all the salvaged components into a new research project has started.

  • December 2019: We gave up brute-forcing the VeraCrypt container and moved on to the data that was saved in screenshots and the data that was uploaded to Google Translator Toolkit.

  • June 2019: We obtained a small disk image of the Ubuntu 18.04.2 desktop system that was used to build the Jupyter Notebook files. After many attempts to find evidence on the system image, it became clear that nothing was stored there. We found a symlink to a VeraCrypt container, but we did not find a password that would open a hidden container. The simple password 'password' did open an empty container.

  • May 2019: We have lost access to the original data source. The server with the research data is also no longer accessible. Even the Chinese student who was helping to build the Jupyter Notebook files (creating stop word lists, tokenization, lemmatization, and phrase matching) based on the WeChat dialogues can no longer be reached.
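
The preprocessing steps named in the May 2019 note (stop word lists, tokenization, lemmatization, and phrase matching) can be illustrated with a minimal Python sketch. This is not the original notebook code, which is no longer available. The stop word list, the lemma lookup table, the phrase list, and all function names below are hypothetical and chosen only for illustration. The example uses English text and a regular-expression tokenizer; Chinese text is not whitespace-delimited and would need a dedicated word segmenter instead.

```python
import re

# Hypothetical, tiny resources for illustration only.
# The original stop word lists and phrase patterns are not available.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in"}
LEMMAS = {"sent": "send", "sending": "send", "messages": "message"}
PHRASES = [("send", "message")]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its base form using a toy lookup table."""
    return [LEMMAS.get(t, t) for t in tokens]


def match_phrases(tokens, phrases=PHRASES):
    """Return (start_index, phrase) for each phrase found as a contiguous token run."""
    hits = []
    for phrase in phrases:
        n = len(phrase)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == tuple(phrase):
                hits.append((i, tuple(phrase)))
    return hits


def preprocess(text):
    """Run tokenization, stop word removal, and lemmatization in order."""
    return lemmatize(remove_stop_words(tokenize(text)))


tokens = preprocess("She sent the messages in the morning")
print(tokens)
print(match_phrases(tokens))
```

Phrase matching runs on the lemmatized tokens, so that inflected forms such as "sent the messages" still match the base-form pattern once stop words are removed. A real pipeline would typically replace the toy lookup table with a proper lemmatizer from an NLP library.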


