
javhoo_actresses

What's javhoo_actresses used for?

I started this project to collect and analyse information about Japanese pornstars. Javhoo.com contains the data I am interested in; javhoo_actresses extracts that data from downloaded HTML files and saves it to a SQLite database.

Update!

The data is available as a Kaggle dataset, japanese-pornstars-and-adult-videos, so you can publish a new kernel based on it.
Added metadata for Japanese censored, uncensored, and VR videos to javhoo_actresses/db/javhooDB.db.
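The README does not document the tables inside javhooDB.db, so a quick way to see what the file contains is to ask SQLite itself. The following is a minimal sketch using only the standard library; the function name `list_tables` is hypothetical and not part of this repository, and the path in the usage comment assumes you run it from the repository root.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file.

    Note: sqlite3.connect() creates an empty file if db_path does not exist.
    """
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Usage (path is an assumption about where you cloned the repository):
#   for table, rows in list_tables("db/javhooDB.db").items():
#       print(f"{table}: {rows} rows")
```

Since table names are not known in advance, the sketch reads them from the catalogue instead of hard-coding them.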

How to use

First, fetch the HTML pages from javhoo.com/actresses using cURL. Javhoo.com currently has 212 pages listing Japanese pornstars, so you need to download all 212 pages. You can paste this command into your bash shell.

curl "https://www.javhoo.com/actresses/page/[1-212]" > javhoo_actresses_212pages.html
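If cURL is not available, the same download can be approximated in Python. This is a rough sketch, not part of the repository: the names `page_urls`, `fetch_page`, and `download_pages` are hypothetical, and the URL pattern is taken from the cURL command above. Like the cURL version, it appends every page to a single output file.

```python
import urllib.request

# URL pattern copied from the cURL command; {} is the page number.
BASE_URL = "https://www.javhoo.com/actresses/page/{}"


def page_urls(last_page=212):
    """Build the list of page URLs from 1 to last_page inclusive."""
    return [BASE_URL.format(n) for n in range(1, last_page + 1)]


def fetch_page(url):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def download_pages(out_path, last_page=212, fetcher=fetch_page):
    """Write all pages, one after another, into a single file."""
    with open(out_path, "wb") as out:
        for url in page_urls(last_page):
            out.write(fetcher(url))


# Usage:
#   download_pages("javhoo_actresses_212pages.html")
```

The `fetcher` parameter exists only so the download step can be swapped out, for example to add a delay between requests or to test without network access.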

Now that you have the file javhoo_actresses_212pages.html, modify the configuration in javhoo_actresses.py. javhooDB.db is where the data extracted from the HTML pages is stored. Initially, it should be empty.

jactress_dict = {
    'html_path': '/path/to/your/javhoo_actresses_212pages.html',
    'sqlite3db_path': '/path/to/your/javhooDB.db'
}

After that, you can run the Python script.

python javhoo_actresses.py

Here is the result.
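For readers who want a feel for the extraction step, here is a simplified sketch of parsing the downloaded HTML and writing rows to SQLite with the standard library. It is not the code in javhoo_actresses.py. The page markup is an assumption (profile links are taken to be `<a>` tags whose `href` contains a hypothetical marker, `/star/`), and the `actress` table with `name` and `url` columns is likewise made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for <a> tags whose href contains a marker."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker
        self.links = []
        self._href = None  # href of the matching <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = "".join(self._text).strip()
            if name:
                self.links.append((name, self._href))
            self._href = None


def extract_to_sqlite(html_text, db_path, marker="/star/"):
    """Parse html_text, store matching links, and return how many were found."""
    parser = LinkCollector(marker)
    parser.feed(html_text)
    parser.close()
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS actress (name TEXT, url TEXT UNIQUE)"
            )
            conn.executemany(
                "INSERT OR IGNORE INTO actress (name, url) VALUES (?, ?)",
                parser.links,
            )
    finally:
        conn.close()
    return len(parser.links)
```

To try it on the downloaded file, read javhoo_actresses_212pages.html as text and pass the contents together with the database path. The marker will need adjusting to whatever the real pages use.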

About

crawl profiles of Japanese PornStars from Javhoo.com

crawler pornstar uncensored japanese-av censored-av japanese-pornstars javhoo

GPL-3.0 license
