Name		Name	Last commit message	Last commit date
parent directory ..
model		model
README.md		README.md
adab.py		adab.py
requirements.txt		requirements.txt

README.md

adaboost

Implementation of adaboost algorithm to classify gender, based on comments from instagram media(s).

Setup

Assuming you have virtualenv installed:

virtualenv venv && source venv/bin/activate
pip install -r requirements.txt
./ada.py --help

$ ./ada.py -h
usage: ada.py [-h] [-l LIMIT] [-r MALE_FEMALE_RATIO] [-c] [-g GAMMA]
              [-k KERNEL]

optional arguments:
  -h, --help            show this help message and exit
  -l LIMIT, --limit LIMIT
                        Limit processed data, default ratio is equal male and
                        female comments
  -r MALE_FEMALE_RATIO, --male-female-ratio MALE_FEMALE_RATIO
                        Male to female ratio
  -c, --cache           Cache processed raw data
  -g GAMMA, --gamma GAMMA
                        Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’
  -k KERNEL, --kernel KERNEL
                        Specifies the kernel type to be used in the algorithm

deactivate

Words Blacklist

Put blacklisted words in data/blacklist.txt (one word per line)

They will be read at the beginning of the program.

Example

$ ./ada.py -l 2000 -r 0.3
Running AdaBoost Classifier
Reading blacklist words file
Reading raw gender-comment data
Loaded 23499 male and 51030 female comments
Limiting male and female comments to 600 male and 1400 female (2000 total)
Total of 5162 words found

Processing 2000 raw gender-comment data: [#######################################-]

Running tests
=============
Test-1
> Training model...
> Predicting test data...
> Accuracy: 68.80%

Test-2
> Training model...
> Predicting test data...
> Accuracy: 70.40%

Test-3
> Training model...
> Predicting test data...
> Accuracy: 74.40%

Test-4
> Training model...
> Predicting test data...
> Accuracy: 70.80%

Test-5
> Training model...
> Predicting test data...
> Accuracy: 70.00%

Test-6
> Training model...
> Predicting test data...
> Accuracy: 66.40%

Test-7
> Training model...
> Predicting test data...
> Accuracy: 68.80%

Test-8
> Training model...
> Predicting test data...
> Accuracy: 71.60%

=====================================
Avg. Accuracy: 70.15%
Elapsed time: 278.01s

Reference

https://scikit-learn.org/stable/modules/ensemble.html#adaboost

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

adab

adab

README.md

adaboost

Setup

Words Blacklist

Example

Reference

Files

adab

Directory actions

More options

Directory actions

More options

Latest commit

History

adab

Folders and files

parent directory

README.md

adaboost

Setup

Words Blacklist

Example

Reference