
Webscrapper to extract Whatsapp messages from a user/group conversation.


Wh014M/Whatsapp-Webscrapper


Python Whatsapp Webscrapper

Web scraper that extracts WhatsApp messages from a user or group conversation.


Requirements


Usage

  1. Download the web drivers and place them in the drivers folder.

    • The Chrome driver must be named chromedriver.exe
    • The Firefox driver must be named geckodriver.exe
  2. Configure the browser paths in settings.conf. Only Firefox, Chrome, and Brave are supported.
    The file contains example paths; replace them with your own configuration.

    browser : Name of the browser to use. Supported values: chrome/firefox. For Brave, use 'chrome'.
    binary : Path to the browser executable.
    profile_path : Path to the browser user profile.

  3. In contacts.json, add the user(s) or group(s) whose messages you want to extract.
    last_message is the message after which all subsequent messages are collected.
    One limitation is that the first run needs a sufficiently old message as last_message, which you have to find by scrolling through the WhatsApp conversation manually.

  4. Run the program.

    python main.py

    On the first run, a browser window opens. Log in to WhatsApp by scanning the QR code and, once the session is open, close the browser.
    Run the program again. A new window opens automatically, and the program starts searching for the contacts and extracting their messages.
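
Before the first run, it can help to check that both configuration files parse. The following is only a sketch: the [settings] section name, the /usr/bin/firefox binary path, and the contacts.json layout (a list of objects with name and last_message keys) are assumptions made for illustration, and the files shipped with the project may be laid out differently.

```python
import configparser
import json

# Hypothetical file contents, used here instead of reading from disk.
SETTINGS_TEXT = """
[settings]
browser : firefox
binary : /usr/bin/firefox
profile_path : /home/user/.mozilla/firefox/xxxxxxxx.default
"""

CONTACTS_TEXT = """
[
  {"name": "John Smith", "last_message": "See you tomorrow"},
  {"name": "Family Group", "last_message": "Happy birthday!"}
]
"""

# Brave is configured with the value 'chrome', so it is not listed separately.
SUPPORTED_BROWSERS = {"chrome", "firefox"}


def load_settings(text, section="settings"):
    """Parse the settings text and reject unsupported browser values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = dict(parser[section])
    browser = settings.get("browser", "").strip().lower()
    if browser not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {browser!r}")
    settings["browser"] = browser
    return settings


def load_contacts(text):
    """Parse the contacts text and require a last_message for every entry."""
    contacts = json.loads(text)
    for contact in contacts:
        if not contact.get("last_message"):
            raise ValueError(f"contact {contact.get('name')!r} has no last_message")
    return contacts


if __name__ == "__main__":
    print(load_settings(SETTINGS_TEXT))
    print(load_contacts(CONTACTS_TEXT))
```

To check real files, the same functions could be called with the text read from settings.conf and contacts.json, provided those files match the assumed layout.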


Data Export

The program automatically creates a folder named data in the project root and exports all messages there in CSV format with the following columns:

  • Date
  • Hour
  • User
  • Message
  • Emojis
  • Quoted_Message
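
As a rough illustration of this column layout, the sketch below writes and reads rows with Python's csv module. It assumes a header row and comma-separated values, which the project may or may not use, and the sample row is invented for the example.

```python
import csv
import io

# Column names as listed above.
COLUMNS = ["Date", "Hour", "User", "Message", "Emojis", "Quoted_Message"]


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    # Invented sample data; the date and hour formats are assumptions.
    sample = [{
        "Date": "2021-04-03",
        "Hour": "21:25",
        "User": "John Smith",
        "Message": "Hello, see you tomorrow",
        "Emojis": "",
        "Quoted_Message": "",
    }]
    print(rows_to_csv(sample))
```

Messages that contain commas or line breaks are quoted by the csv module, so they survive a round trip through these two functions.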

The file naming format is as follows: [contact_name/group] [date and hour].csv, for example:

  • John Smith 2021-04-03 21.25.38.csv
  • Family Group 2021-05-03 23.50.31.csv
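
A file name in this format can be produced with a single format string. This is a minimal sketch, and export_filename is a hypothetical helper name, not necessarily what the project uses.

```python
from datetime import datetime


def export_filename(contact_name, when):
    """Build '<contact or group> <YYYY-MM-DD HH.MM.SS>.csv'."""
    return f"{contact_name} {when:%Y-%m-%d %H.%M.%S}.csv"


if __name__ == "__main__":
    print(export_filename("John Smith", datetime(2021, 4, 3, 21, 25, 38)))
    # John Smith 2021-04-03 21.25.38.csv
```

Dots separate the time fields rather than colons, presumably because colons are not allowed in Windows file names.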

Settings

Firefox profile_path

In Firefox, the profile is located at:

  • Linux: /home/user/.mozilla/firefox/xxxxxxxx.default
  • Windows: C:/Users/[user_name]/AppData/Roaming/Mozilla/Firefox/Profiles/xxxxxxxx.default
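
To list candidate profile folders without browsing for them, a short script can search the locations above. This is a sketch with hypothetical helper names: it assumes profile folder names contain ".default" (which also matches variants such as ".default-release"), and it uses the APPDATA environment variable on Windows, which normally points to AppData/Roaming.

```python
import os
import sys
from pathlib import Path


def default_profiles_dir():
    """Return the usual Firefox profiles directory for this platform."""
    if sys.platform.startswith("win"):
        return Path(os.environ["APPDATA"]) / "Mozilla" / "Firefox" / "Profiles"
    return Path.home() / ".mozilla" / "firefox"


def find_firefox_profiles(base_dir):
    """Return the directories under base_dir whose names contain '.default'."""
    base = Path(base_dir)
    return sorted(p for p in base.glob("*.default*") if p.is_dir())


if __name__ == "__main__":
    for profile in find_firefox_profiles(default_profiles_dir()):
        print(profile)
```

If several folders are printed, the one to use is the profile in which the WhatsApp session was opened.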

Once you have located the path, add it to settings.conf under profile_path.

Chrome/Brave profile_path

For Chrome or Brave, open the browser and type chrome://version/ in the address bar. The page shows the path of the user profile.
Copy that path and add it to settings.conf under profile_path.


Program limitations

  • If the program finds any message identical to last_message, it treats that message as the last extraction point, even if it is not the message you intended as the starting point.
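
The effect of this limitation can be shown with a small sketch. It is not the project's code: it assumes the conversation is scanned from the newest message backwards and that messages are compared by exact text, and collect_new_messages is a hypothetical name.

```python
def collect_new_messages(messages, last_message):
    """Return the messages that follow the most recent match of last_message.

    `messages` is ordered from oldest to newest. The scan runs from the
    newest message backwards and stops at the first exact match (assumed
    behaviour, for illustration only).
    """
    collected = []
    for text in reversed(messages):
        if text == last_message:
            break
        collected.append(text)
    collected.reverse()
    return collected


if __name__ == "__main__":
    history = ["ok", "Are you coming?", "ok", "See you at 8"]
    # Even if the first "ok" was the intended marker, the scan stops at the
    # second one, so "Are you coming?" is not collected.
    print(collect_new_messages(history, "ok"))  # ['See you at 8']
```

Choosing a distinctive last_message, one unlikely to be repeated in the conversation, reduces the chance of stopping at the wrong place.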



