Update README, CHANGELOG and template
vifreefly committed Jan 30, 2019
1 parent a4ec725 commit 340257a
Showing 4 changed files with 22 additions and 5 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,12 @@
# CHANGELOG
## 1.4.0
### New
* Add `encoding` config option (see [All available config options](https://github.com/vifreefly/kimuraframework#all-available-config-options))
* Validate url before processing a request (Base#request_to)

### Fixes
* Fix console command bug (see [issue 21](https://github.com/vifreefly/kimuraframework/issues/21))

## 1.3.2
### Fixes
* In the project template, set the Ruby version to >= 2.5 (previously it was hard-coded to 2.5.1)
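
The 1.4.0 entry above mentions validating a URL before a request is processed. The commit page does not show how `Base#request_to` performs that check, so the following is only a minimal sketch of one way such a validation could look, using Ruby's standard `uri` library. The names `valid_http_url?` and `checked_url` are hypothetical and are not part of Kimurai.

```ruby
require 'uri'

# Hypothetical helper for illustration; not Kimurai's actual implementation.
# Returns true only for absolute http(s) URLs that include a host.
def valid_http_url?(url)
  uri = URI.parse(url.to_s)
  # URI::HTTPS inherits from URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical guard showing where such a check could sit before a request.
def checked_url(url)
  raise ArgumentError, "Invalid URL: #{url.inspect}" unless valid_http_url?(url)
  url
end
```

Failing early like this keeps a malformed or relative URL from reaching the browser driver, where the resulting error is usually harder to trace back to its source.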
11 changes: 7 additions & 4 deletions README.md
@@ -10,15 +10,12 @@
> * The code was massively refactored to [support](#using-kimurai-inside-existing-ruby-application) running spiders multiple times from inside a single process. It is now possible to run Kimurai spiders from background jobs such as Sidekiq.
> * `require 'kimurai'` doesn't require any gems except Active Support. Capybara is required, with a specific driver, only when a particular spider [starts](#crawl-method).
> * Although Kimurai [extends](lib/kimurai/capybara_ext) Capybara (all the magic happens inside the [extended](lib/kimurai/capybara_ext/session.rb) `Capybara::Session#visit` method), session instances that were created manually will behave normally.
> * No more spaghetti code with `case/when/end` blocks. All drivers [were extended](lib/kimurai/capybara_ext) to support unified methods for cookies, proxies, headers, etc.
> * The `selenium_url_to_set_cookies` @config option is no longer needed if you use a Selenium-like engine with custom cookies.
> * Small changes in design (check the readme again to see what was changed)
> * The stats database and its web dashboard were removed
> * Again, a massive refactor. The code now looks much better than before.
<br>

-> Note: this readme is for `1.3.2` gem version. CHANGELOG [here](CHANGELOG.md).
+> Note: this readme is for `1.4.0` gem version. CHANGELOG [here](CHANGELOG.md).
Kimurai is a modern web scraping framework written in Ruby which **works out of the box with Headless Chromium/Firefox, PhantomJS**, or simple HTTP requests and **lets you scrape and interact with JavaScript-rendered websites.**

@@ -1592,6 +1589,12 @@ end
# Format: same as for the `skip_request_errors` option.
retry_request_errors: [Net::ReadTimeout],

# Handle page encoding while parsing the HTML response with Nokogiri. There are two modes:
# Auto (`:auto`): try to detect the correct encoding from the <meta http-equiv="Content-Type"> or <meta charset> tags.
# Manual: set the required encoding explicitly, for example `encoding: "GB2312"`.
# By default, this option is unset.
encoding: nil,

# Restart browser if one of the options is true:
restart_if: {
# Restart browser if provided memory limit (in kilobytes) is exceeded (works for all engines)
2 changes: 1 addition & 1 deletion lib/kimurai/template/Gemfile
@@ -4,7 +4,7 @@ git_source(:github) { |repo| "https://github.com/#{repo}.git" }
ruby '>= 2.5'

# Framework
-gem 'kimurai', '~> 1.0'
+gem 'kimurai', '~> 1.4'

# Require files in directory and child directories recursively
gem 'require_all'
6 changes: 6 additions & 0 deletions lib/kimurai/template/spiders/application_spider.rb
@@ -100,6 +100,12 @@ class ApplicationSpider < Kimurai::Base
# Format: same as for the `skip_request_errors` option.
# retry_request_errors: [Net::ReadTimeout],

# Handle page encoding while parsing the HTML response with Nokogiri. There are two modes:
# Auto (`:auto`): try to detect the correct encoding from the <meta http-equiv="Content-Type"> or <meta charset> tags.
# Manual: set the required encoding explicitly, for example `encoding: "GB2312"`.
# By default, this option is unset.
# encoding: nil,

# Restart browser if one of the options is true:
restart_if: {
# Restart browser if provided memory limit (in kilobytes) is exceeded (works for all engines)
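
The `encoding` option added in this commit has three states: unset, `:auto`, and an explicit name such as `"GB2312"`. The diff shows only the config comments, not the code that acts on them, so the sketch below is an assumption-laden illustration of the two modes using only Ruby's standard library. The names `META_CHARSET`, `resolve_encoding` and `to_utf8` are hypothetical. Kimurai itself, according to the comments, delegates the parsing to Nokogiri.

```ruby
# Hypothetical names throughout; this is not Kimurai's implementation.
META_CHARSET = /<meta[^>]*?charset=["']?([\w\-]+)/i

# mode mirrors the option values described in the config comments:
#   nil (unset), :auto, or an explicit name such as "GB2312".
def resolve_encoding(raw_body, mode)
  case mode
  when nil   then nil
  when :auto then raw_body.b[META_CHARSET, 1] # scan raw bytes so invalid sequences cannot raise
  else            mode.to_s
  end
end

# Returns a UTF-8 copy of the body, or the body untouched when no
# encoding is configured, detected, or known to Ruby.
def to_utf8(raw_body, mode)
  name = resolve_encoding(raw_body, mode)
  return raw_body if name.nil?

  encoding = Encoding.find(name)
  raw_body.dup.force_encoding(encoding)
          .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
rescue ArgumentError
  raw_body # unknown encoding name
end
```

With Nokogiri, the resolved name could instead be passed as the third argument of `Nokogiri::HTML(body, nil, name)`, which would leave the decoding to the parser. Whether Kimurai does exactly that is not visible in this diff.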