InbarShirizly edited this page Jul 26, 2020 · 4 revisions

General

Welcome to the Data-mining-project wiki!

This document contains general information about the important modules in the project, main.py and user.py.

This section briefly explains each file in the program:

Files

  • src:

    • __init__.py - runs when the src package is imported into main; it initializes general processes such as setting up the connection, creating an engine, etc.
    • conf.py - configuration file; generates all the important values from the JSON file and from the command-line input.
    • general.py - a general-purpose function that is needed in multiple Python files.
    • geo_location.py - contains three nested functions that receive a general location string and retrieve a generic country and continent.
    • logger.py - contains the class logger, which provides a general logger format.
    • ORM.py - defines the schema using an ORM: creates the tables and all the relationships between the tables in the database.
    • user.py - includes the class User(UserScraper), which extracts the data for an individual user and adds it to the relevant tables in the database.
    • user_analysis.py - includes the class UserAnalysis(Website), which creates a generator of links to each individual user page.
    • user_scraper.py - includes the class UserScraper(object), which receives the user's URL and scrapes all the information into class variables.
    • website.py - includes the class Website(object), which creates soups of pages, finds the last page, and creates soups for the main topic pages.
    • working_with_database.py - contains most of the functions that perform CRUD operations on the database.
  • create_json_file.py - Python file that generates mining_constants.json.

  • mining_constants.json - JSON file that contains the constants for the whole program. conf.py parses the JSON into class variables, and the rest of the program imports these values only through conf.py.

  • requirements.txt - lists all the packages and dependencies of the project.
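
To illustrate how a configuration module might parse a JSON file into class variables and combine it with command-line input, as described for conf.py above, here is a minimal sketch. It is not the project's actual code: the class name `Config`, the `load` method, the `--num_users` flag, and the JSON keys `NUM_USERS` and `SLEEP_FACTOR` are all hypothetical and chosen only for the example.

```python
import argparse
import json


class Config:
    """Holds program constants as class variables (all names are hypothetical)."""

    @classmethod
    def load(cls, json_text, argv):
        # Copy every top-level JSON key onto the class as a class variable.
        for key, value in json.loads(json_text).items():
            setattr(cls, key, value)

        # Let command-line input override a value from the JSON file.
        parser = argparse.ArgumentParser()
        parser.add_argument("--num_users", type=int, default=None)  # hypothetical flag
        args = parser.parse_args(argv)
        if args.num_users is not None:
            cls.NUM_USERS = args.num_users
        return cls


# Stand-in for the contents of mining_constants.json (keys are made up).
SAMPLE_JSON = '{"NUM_USERS": 10, "SLEEP_FACTOR": 1.5}'

Config.load(SAMPLE_JSON, ["--num_users", "25"])
print(Config.NUM_USERS, Config.SLEEP_FACTOR)  # prints: 25 1.5
```

With this layout, other modules would read values such as `Config.NUM_USERS` rather than opening the JSON file themselves, which matches the idea that only the configuration module is used to import constants.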
