Overview

Starter kit for people who want to do web scraping with Ruby on Docker.

Environment

Ruby
MySQL

Execution procedure

Clone this repository
docker-compose build
docker-compose up -d
docker-compose exec play bash
bundle install --path .bundle
bundle exec ruby **.rb (replace **.rb with the script to run, such as crawler.rb)

Included sample code

crawler.rb

Scrapes a page using net/http and the HTML parser Nokogiri, which is the first step in HTTP communication with Ruby. (It scrapes the Qiita My Page of the developer, itaya.)

crawler_mysql.rb

Shows how to store values fetched with Ruby in a MySQL database (it stores the titles and URLs of Qiita articles).

Table

CREATE TABLE `articles` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(255) NOT NULL DEFAULT '',
  `url` varchar(255) NOT NULL DEFAULT '',
  `created_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Settings for connecting from the host side with Sequel Pro or a similar client

Host: 127.0.0.1
User name: root
Password: password
Port: 3307
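
As a rough sketch of how the table and connection settings above could be used from Ruby, the code below inserts one row into articles with the mysql2 gem. This is not the actual contents of crawler_mysql.rb. The method names and the database name 'scraping' are assumptions, and the host and port shown are the host-side values; from inside the container they will likely differ.

```ruby
# Connection settings from this README. The database name is a
# hypothetical placeholder, since the README does not state it.
def db_settings
  {
    host: '127.0.0.1',
    port: 3307,
    username: 'root',
    password: 'password',
    database: 'scraping' # assumption: replace with the real database name
  }
end

# Open a connection with the mysql2 gem. The gem is required lazily so
# that save_article can also be tried with any object responding to #prepare.
def connect(settings = db_settings)
  require 'mysql2'
  Mysql2::Client.new(**settings)
end

# Store one article through a prepared statement, so scraped text is
# passed as a bound parameter instead of being interpolated into the SQL.
def save_article(client, title, url, now: Time.now)
  sql = 'INSERT INTO articles (title, url, created_at) VALUES (?, ?, ?)'
  statement = client.prepare(sql)
  statement.execute(title, url, now.strftime('%Y-%m-%d %H:%M:%S'))
end
```

A typical call would be save_article(connect, 'Some title', 'https://example.com/article'). The id column is filled in by AUTO_INCREMENT, so only title, url, and created_at are supplied.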

crawler_selenium.rb

Implements the same procedure as above using headless Chrome.
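
A minimal sketch of driving headless Chrome from Ruby with the selenium-webdriver gem might look like the following. It is not the actual contents of crawler_selenium.rb. The method names, the CSS selector, and the choice of Chrome flags are assumptions, and it presumes Chrome and a matching driver are available in the container.

```ruby
# Chrome flags commonly used when running inside a container.
# Which flags are needed depends on the image, so treat this as a guess.
def headless_chrome_arguments
  ['--headless', '--no-sandbox', '--disable-dev-shm-usage']
end

# Build a headless Chrome driver. The gem is required lazily so that
# scrape_links can be tried with any object that behaves like a driver.
def build_driver
  require 'selenium-webdriver'
  options = Selenium::WebDriver::Chrome::Options.new
  headless_chrome_arguments.each { |argument| options.add_argument(argument) }
  Selenium::WebDriver.for(:chrome, options: options)
end

# Load a page in the browser and collect link text and URLs from the
# rendered DOM. The default selector 'a' is an assumption.
def scrape_links(driver, url, selector: 'a')
  driver.get(url)
  driver.find_elements(css: selector).map do |element|
    { title: element.text.strip, url: element.attribute('href') }
  end
end
```

In use, create the driver with build_driver, call scrape_links(driver, 'https://example.com/'), and call driver.quit in an ensure block so the browser process is always shut down. Compared with the net/http version, this route is slower but sees content that is rendered by JavaScript.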

itayayuichiro/ruby-scraping-docker
