
Commit

Bump version: 1.2.0 to 1.3.0 (#80)
my8100 committed Aug 4, 2019
1 parent 7ca184f · commit a449dbf
Showing 48 changed files with 77 additions and 38 deletions.
12 changes: 6 additions & 6 deletions .circleci/config.yml
@@ -69,14 +69,14 @@ jobs:
- run:
name: Setup DATA_PATH
command: |
printf "\nDATA_PATH = '"$DATA_PATH"'\n" >> scrapydweb_settings_v8.py
printf "\nDATA_PATH = '"$DATA_PATH"'\n" >> scrapydweb_settings_v9.py
- when:
condition: <<parameters.use-sqlite>>
steps:
- run:
name: Set DATABASE_URL to sqlite
command: |
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v8.py
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v9.py
- when:
condition: <<parameters.use-postgresql>>
steps:
@@ -91,7 +91,7 @@ jobs:
name: Set DATABASE_URL to postgresql
command: |
# postgres://circleci@127.0.0.1:5432
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v8.py
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v9.py
- when:
condition: <<parameters.use-mysql>>
steps:
@@ -121,7 +121,7 @@ jobs:
name: Set DATABASE_URL to mysql
command: |
# mysql://user:passw0rd@127.0.0.1:3306
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v8.py
printf "\nDATABASE_URL = '"$DATABASE_URL"'\n" >> scrapydweb_settings_v9.py
- run:
name: Install dependencies
@@ -168,8 +168,8 @@ jobs:
- run:
name: Generate report
command: |
touch scrapydweb_settings_v8.py
cat scrapydweb_settings_v8.py
touch scrapydweb_settings_v9.py
cat scrapydweb_settings_v9.py
echo $DATA_PATH
echo $DATABASE_URL
. venv/bin/activate
15 changes: 15 additions & 0 deletions HISTORY.md
@@ -1,5 +1,20 @@
Release History
===============
[1.3.0](https://github.com/my8100/scrapydweb/issues?q=is%3Aclosed+milestone%3A1.3.0) (2019-08-04)
------------------
- New Features
- Add new pages Node Reports and Cluster Reports for aggregating job stats [(issue #72)](https://github.com/my8100/scrapydweb/issues/72)
- Improvements
- Adapt to [:link: *LogParser*](https://github.com/my8100/logparser) v0.8.2
- Add the DATA_PATH option for customizing the path used to save program data [(issue #40)](https://github.com/my8100/scrapydweb/issues/40)
- Add the DATABASE_URL option to support a MySQL or PostgreSQL backend [(issue #42)](https://github.com/my8100/scrapydweb/issues/42)
- Support specifying the latest version of a Scrapy project in the Run Spider page [(issue #4)](https://github.com/my8100/scrapydweb/issues/4#issuecomment-475145676)
- Support specifying default values of settings & arguments in the Run Spider page [(issue #55)](https://github.com/my8100/scrapydweb/issues/55)
- Others
- Update config file to scrapydweb_settings_v9.py
- Support continuous integration (CI) on [CircleCI](https://circleci.com/)


1.2.0 (2019-03-12)
------------------
- New Features
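The DATA_PATH and DATABASE_URL options listed under 1.3.0 above are read from the renamed config file. Below is a minimal, hypothetical sketch of what those two entries might look like in scrapydweb_settings_v9.py; the path and credentials are illustrative assumptions, and the MySQL/PostgreSQL URL shapes simply echo the comments in the .circleci/config.yml hunk above.

```python
# scrapydweb_settings_v9.py -- illustrative sketch only, not part of this commit
import os

# Assumed example: directory where ScrapydWeb keeps its program data (issue #40).
DATA_PATH = os.path.join(os.path.expanduser('~'), 'scrapydweb_data')

# Assumed example: switch the backend from SQLite to MySQL or PostgreSQL (issue #42).
DATABASE_URL = 'mysql://user:passw0rd@127.0.0.1:3306'
# DATABASE_URL = 'postgres://circleci@127.0.0.1:5432'
```

Judging from the else branch shown in the scrapydweb/utils/setup_database.py hunk further down, leaving DATABASE_URL unset falls back to per-component SQLite files under the data path.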
6 changes: 5 additions & 1 deletion README.md
@@ -66,10 +66,14 @@ and restart Scrapyd to make it visible externally.
```bash
pip install scrapydweb
```
:heavy_exclamation_mark: Note that you may need to execute `pip install -U pip` first in order to get the latest version of scrapydweb, or download the tar.gz file from https://pypi.org/project/scrapydweb/#files and get it installed via `pip install scrapydweb-x.x.x.tar.gz`
:heavy_exclamation_mark: Note that you may need to execute `python -m pip install --upgrade pip` first in order to get the latest version of scrapydweb, or download the tar.gz file from https://pypi.org/project/scrapydweb/#files and get it installed via `pip install scrapydweb-x.x.x.tar.gz`

- Use git:
```bash
pip install --upgrade git+https://github.com/my8100/scrapydweb.git
```
Or:
```bash
git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install
6 changes: 5 additions & 1 deletion README_CN.md
@@ -66,10 +66,14 @@
```bash
pip install scrapydweb
```
:heavy_exclamation_mark: 如果 pip 安装结果不是最新版本的 scrapydweb,请先执行`pip install -U pip`,或者前往 https://pypi.org/project/scrapydweb/#files 下载 tar.gz 文件并执行安装命令 `pip install scrapydweb-x.x.x.tar.gz`
:heavy_exclamation_mark: 如果 pip 安装结果不是最新版本的 scrapydweb,请先执行`python -m pip install --upgrade pip`,或者前往 https://pypi.org/project/scrapydweb/#files 下载 tar.gz 文件并执行安装命令 `pip install scrapydweb-x.x.x.tar.gz`

- 通过 git:
```bash
pip install --upgrade git+https://github.com/my8100/scrapydweb.git
```
或:
```bash
git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install
2 changes: 1 addition & 1 deletion requirements.txt
@@ -3,7 +3,7 @@ APScheduler>=3.5.3
flask>=1.0.2
flask-compress>=1.4.0
Flask-SQLAlchemy>=2.3.2
logparser==0.8.1
logparser==0.8.2
requests>=2.21.0
setuptools>=40.6.3
six>=1.12.0
5 changes: 3 additions & 2 deletions scrapydweb/__init__.py
@@ -215,7 +215,8 @@ def register_view(view, endpoint, url_defaults_list):
register_view(DeployUploadView, 'deploy.upload', [('deploy/upload', None)])
register_view(DeployXhrView, 'deploy.xhr', [('deploy/xhr/<eggname>/<project>/<version>', None)])

from .views.operations.schedule import ScheduleView, ScheduleCheckView, ScheduleRunView, ScheduleXhrView, ScheduleTaskView
from .views.operations.schedule import (ScheduleView, ScheduleCheckView, ScheduleRunView,
ScheduleXhrView, ScheduleTaskView)
register_view(ScheduleView, 'schedule', [
('schedule/<project>/<version>/<spider>', None),
('schedule/<project>/<version>', dict(spider=None)),
@@ -271,7 +272,7 @@ def handle_template_context(app):
STATIC = 'static'
VERSION = 'v' + __version__.replace('.', '')
# MUST be commented out for released version
VERSION = 'v121dev'
# VERSION = 'v131dev'

@app.context_processor
def inject_variable():
2 changes: 1 addition & 1 deletion scrapydweb/__version__.py
@@ -1,7 +1,7 @@
# coding: utf-8

__title__ = 'scrapydweb'
__version__ = '1.2.0'
__version__ = '1.3.0'
__author__ = 'my8100'
__author_email__ = 'my8100@gmail.com'
__url__ = 'https://github.com/my8100/scrapydweb'
3 changes: 3 additions & 0 deletions scrapydweb/default_settings.py
@@ -61,11 +61,14 @@
# to the Scrapyd server.
# e.g. '127.0.0.1:6800' or 'localhost:6801', do not forget the port number.
LOCAL_SCRAPYD_SERVER = ''

# In the directory where you run Scrapyd, run the command below
# to find out where the Scrapy logs are stored:
# python -c "from os.path import abspath, isdir; from scrapyd.config import Config; path = abspath(Config().get('logs_dir')); print(path); print(isdir(path))"
# Check out https://scrapyd.readthedocs.io/en/stable/config.html#logs-dir for more info.
# e.g. 'C:/Users/username/logs' or '/home/username/logs'
LOCAL_SCRAPYD_LOGS_DIR = ''

# The default is False, set it to True to automatically run LogParser as a subprocess at startup.
# Note that you can run the LogParser service separately via command 'logparser' as you like.
# Run 'logparser -h' to find out the config file of LogParser for more advanced settings.
4 changes: 3 additions & 1 deletion scrapydweb/run.py
@@ -133,7 +133,9 @@ def load_custom_settings(config):
"Then add your SCRAPYD_SERVERS in the config file and restart scrapydweb.\n".format(
file=SCRAPYDWEB_SETTINGS_PY))
else:
sys.exit("\nThe config file '{file}' has been copied to current working directory.\n"
sys.exit("\nATTENTION:\nYou may encounter ERROR if there are any timer tasks added in v1.2.0,\n"
"and you have to restart scrapydweb and manually restart the stopped tasks.\n"
"\nThe config file '{file}' has been copied to current working directory.\n"
"Please add your SCRAPYD_SERVERS in the config file and restart scrapydweb.\n".format(
file=SCRAPYDWEB_SETTINGS_PY))

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
6 changes: 5 additions & 1 deletion scrapydweb/templates/scrapydweb/cluster_reports.html
@@ -12,7 +12,11 @@
{% endblock %}

{% block body %}
<h2>{{ selected_nodes|length }} Reports of /{{ project }}/{{ spider }}/{{ job }}/</h2>
<h2>
{% if selected_nodes %}<a class="link" target="_blank" href="{{ url_report }}">{% endif %}
{{ selected_nodes|length }} Reports of /{{ project }}/{{ spider }}/{{ job }}/
{% if selected_nodes %}</a>{% endif %}
</h2>

<form method="post" enctype="multipart/form-data" hidden></form>

2 changes: 1 addition & 1 deletion scrapydweb/utils/check_app_config.py
@@ -334,7 +334,7 @@ def check_connectivity(server):
(_group, _ip, _port, _auth) = server
try:
url = 'http://%s:%s' % (_ip, _port)
r = session.get(url, auth=_auth, timeout=30)
r = session.get(url, auth=_auth, timeout=10)
assert r.status_code == 200, "%s got status_code %s" % (url, r.status_code)
except Exception as err:
logger.error(err)
7 changes: 5 additions & 2 deletions scrapydweb/utils/setup_database.py
@@ -51,6 +51,7 @@ def setup_database(database_url, database_path):
'jobs': '/'.join([database_url, DB_JOBS])
}
else:
# db names for backward compatibility
APSCHEDULER_DATABASE_URI = 'sqlite:///' + '/'.join([database_path, 'apscheduler.db'])
# http://flask-sqlalchemy.pocoo.org/2.3/binds/#binds
SQLALCHEMY_DATABASE_URI = 'sqlite:///' + '/'.join([database_path, 'timer_tasks.db'])
@@ -80,7 +81,8 @@ def setup_mysql(username, password, host, port):
"""
ModuleNotFoundError: No module named 'MySQLdb'
pip install mysqlclient
Python 2: pip install mysqlclient -> MySQLdb/_mysql.c(29) : fatal error C1083: Cannot open include file: 'mysql.h': No such file or directory
Python 2: pip install mysqlclient -> MySQLdb/_mysql.c(29) :
fatal error C1083: Cannot open include file: 'mysql.h': No such file or directory
https://stackoverflow.com/questions/51294268/pip-install-mysqlclient-returns-fatal-error-c1083-cannot-open-file-mysql-h
https://www.lfd.uci.edu/~gohlke/pythonlibs/#mysqlclient
pip install "path to the downloaded mysqlclient.whl file"
@@ -148,7 +150,8 @@ def setup_postgresql(username, password, host, port):
# creating-utf-8-database-in-postgresql-on-windows10

# cur.execute("CREATE DATABASE %s ENCODING 'UTF8' LC_COLLATE 'en-US' LC_CTYPE 'en-US'" % dbname)
# psycopg2.DataError: new collation (en-US) is incompatible with the collation of the template database (Chinese (Simplified)_People's Republic of China.936)
# psycopg2.DataError: new collation (en-US) is incompatible with the collation of the template database
# (Chinese (Simplified)_People's Republic of China.936)
# HINT: Use the same collation as in the template database, or use template0 as template.
try:
cur.execute("CREATE DATABASE %s ENCODING 'UTF8' LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8'" % dbname)
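The setup_database() hunk above has two branches: with a DATABASE_URL, per-component URIs are built by joining the URL with a database name ('jobs': '/'.join([database_url, DB_JOBS])); without one, separate SQLite files such as apscheduler.db and timer_tasks.db are created under the data path. A rough, self-contained sketch of that idea follows; the component names and defaults are assumptions for illustration, not the project's exact code.

```python
def build_db_uris(database_url=None, database_path='/tmp/scrapydweb_data'):
    """Derive per-component database URIs from a single DATABASE_URL (sketch)."""
    components = ['apscheduler', 'timer_tasks', 'metadata', 'jobs']  # assumed names
    if database_url:
        # e.g. 'mysql://user:passw0rd@127.0.0.1:3306' -> one database per component
        return {name: '/'.join([database_url.rstrip('/'), 'scrapydweb_' + name])
                for name in components}
    # SQLite fallback: one .db file per component under the data path
    return {name: 'sqlite:///' + '/'.join([database_path, name + '.db'])
            for name in components}


if __name__ == '__main__':
    print(build_db_uris('postgres://circleci@127.0.0.1:5432'))
    print(build_db_uris())
```

The real helper additionally has to create the MySQL or PostgreSQL databases up front, which is what the setup_mysql() and setup_postgresql() fragments above deal with.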
2 changes: 1 addition & 1 deletion scrapydweb/vars.py
@@ -15,7 +15,7 @@

PYTHON_VERSION = '.'.join([str(n) for n in sys.version_info[:3]])
PY2 = sys.version_info.major < 3
SCRAPYDWEB_SETTINGS_PY = 'scrapydweb_settings_v8.py'
SCRAPYDWEB_SETTINGS_PY = 'scrapydweb_settings_v9.py'
try:
custom_settings_module = importlib.import_module(os.path.splitext(SCRAPYDWEB_SETTINGS_PY)[0])
except ImportError:
4 changes: 2 additions & 2 deletions scrapydweb/views/api.py
@@ -89,10 +89,10 @@ def handle_result(self):
elif self.opt == 'liststats':
if self.js.get('logparser_version') != self.LOGPARSER_VERSION:
if self.project and self.version_spider_job: # 'List Stats' in the Servers page
tip = "'pip install -U logparser' to update LogParser to v%s" % self.LOGPARSER_VERSION
tip = "'pip install --upgrade logparser' to update LogParser to v%s" % self.LOGPARSER_VERSION
self.js = dict(status=self.OK, tip=tip)
else: # XMLHttpRequest in the Jobs page; POST in jobs.py
self.js['tip'] = ("'pip install -U logparser' on host '%s' and run command 'logparser' "
self.js['tip'] = ("'pip install --upgrade logparser' on host '%s' and run command 'logparser' "
"to update LogParser to v%s") % (self.SCRAPYD_SERVER, self.LOGPARSER_VERSION)
self.js['status'] = self.ERROR
elif self.project and self.version_spider_job: # 'List Stats' in the Servers page
7 changes: 4 additions & 3 deletions scrapydweb/views/dashboard/jobs.py
@@ -367,8 +367,9 @@ def handle_jobs_without_db(self):
else:
if job['finish']:
self.finished_jobs.append(job)
job['url_multinode_run'] = url_for('servers', node=self.node, opt='schedule', project=job['project'],
version_job=self.DEFAULT_LATEST_VERSION, spider=job['spider'])
job['url_multinode_run'] = url_for('servers', node=self.node, opt='schedule',
project=job['project'], version_job=self.DEFAULT_LATEST_VERSION,
spider=job['spider'])
job['url_schedule'] = url_for('schedule', node=self.node, project=job['project'],
version=self.DEFAULT_LATEST_VERSION, spider=job['spider'])
job['url_start'] = url_for('api', node=self.node, opt='start', project=job['project'],
@@ -384,7 +385,7 @@ def handle_jobs_without_db(self):
job['url_stats'] = url_for('log', node=self.node, opt='stats', project=job['project'], ui=self.UI,
spider=job['spider'], job=job['job'], job_finished=job_finished)
job['url_clusterreports'] = url_for('clusterreports', node=self.node, project=job['project'],
spider=job['spider'], job=job['job'])
spider=job['spider'], job=job['job'])
# <a href='/items/demo/test/2018-10-12_205507.jl'>Items</a>
m = re.search(HREF_PATTERN, job['href_items'])
if m:
8 changes: 4 additions & 4 deletions scrapydweb/views/files/log.py
@@ -33,8 +33,8 @@
job_finished_key_dict = defaultdict(OrderedDict)
# For /log/report/
job_finished_report_dict = defaultdict(OrderedDict)
REPORT_KEYS_SET = set(['from_memory', 'status', 'pages', 'items', 'shutdown_reason', 'finish_reason',
'runtime', 'first_log_time', 'latest_log_time', 'log_categories', 'latest_matches'])
REPORT_KEYS_SET = {'from_memory', 'status', 'pages', 'items', 'shutdown_reason', 'finish_reason', 'runtime',
'first_log_time', 'latest_log_time', 'log_categories', 'latest_matches'}


# http://flask.pocoo.org/docs/1.0/api/#flask.views.View
@@ -110,7 +110,7 @@ def __init__(self):
self.email_content_kwargs = {}
self.flag = ''

self.jobs_to_keep = self.JOBS_FINISHED_JOBS_LIMIT or 1000
self.jobs_to_keep = self.JOBS_FINISHED_JOBS_LIMIT or 200

def dispatch_request(self, **kwargs):
if self.report_logparser:
@@ -205,7 +205,7 @@ def request_stats_by_logparser(self):
"Or wait until LogParser parses the log. ") % self.SCRAPYD_SERVER, self.WARN)
return
elif js.get('logparser_version') != self.LOGPARSER_VERSION:
msg = "'pip install -U logparser' on host '%s' to update LogParser to v%s" % (
msg = "'pip install --upgrade logparser' on host '%s' to update LogParser to v%s" % (
self.SCRAPYD_SERVER, self.LOGPARSER_VERSION)
self.logger.warning(msg)
flash(msg, self.WARN)
6 changes: 3 additions & 3 deletions scrapydweb/views/operations/execute_task.py
@@ -16,7 +16,7 @@
EXTRACT_URL_SERVER_PATTERN = re.compile(r'//(.+?:\d+)')


class TaskExecuter(object):
class TaskExecutor(object):

def __init__(self, task_id, task_name, url_scrapydweb, url_schedule_task, url_delete_task_result,
auth, selected_nodes):
@@ -159,14 +159,14 @@ def execute_task(task_id):
username = metadata.get('username', '')
password = metadata.get('password', '')
url_delete_task_result = metadata.get('url_delete_task_result', '/1/tasks/xhr/delete/1/1/')
task_executer = TaskExecuter(task_id=task_id,
task_executor = TaskExecutor(task_id=task_id,
task_name=task.name,
url_scrapydweb=metadata.get('url_scrapydweb', 'http://127.0.0.1:5000'),
url_schedule_task=metadata.get('url_schedule_task', '/1/schedule/task/'),
url_delete_task_result=url_delete_task_result,
auth=(username, password) if username and password else None,
selected_nodes=json.loads(task.selected_nodes))
try:
task_executer.main()
task_executor.main()
except Exception:
apscheduler_logger.error(traceback.format_exc())
4 changes: 2 additions & 2 deletions setup.py
@@ -13,7 +13,7 @@
exec(f.read(), about)

with io.open("README.md", 'r', encoding='utf-8') as f:
long_description = re.sub(r':\w+:\s', '', f.read()) # Remove emojis
long_description = re.sub(r':\w+:\s', '', f.read()) # Remove emoji


setup(
@@ -37,7 +37,7 @@
"flask >= 1.0.2", # May 2, 2018
"flask-compress >= 1.4.0", # Jan 5, 2017
"Flask-SQLAlchemy >= 2.3.2", # Oct 11, 2017
"logparser == 0.8.1",
"logparser == 0.8.2",
"requests >= 2.21.0", # Dec 10, 2018
"setuptools >= 40.6.3", # Dec 11, 2018
"six >= 1.12.0", # Dec 10, 2018
2 changes: 1 addition & 1 deletion tests/test_aa_logparser.py
@@ -128,7 +128,7 @@ def rename(name, restore=False):
replace_file_content(app.config['DEMO_JSON_PATH'], old, new)
req(app, client, view='log', kws=kws,
ins=["Mismatching logparser_version 0.0.0 in local stats",
"pip install -U logparser", "Using local logfile:", tab])
"pip install --upgrade logparser", "Using local logfile:", tab])
replace_file_content(app.config['DEMO_JSON_PATH'], new, old)

# delete ScrapydWeb_demo.json in logs
4 changes: 2 additions & 2 deletions tests/test_reports.py
@@ -28,7 +28,7 @@ def test_cluster_reports(app, client):
spider=cst.SPIDER, job=cst.JOBID)
url_redirect_to_clusterreports = url_for('clusterreports', node=1, project=cst.PROJECT,
spider=cst.SPIDER, job=cst.JOBID)
ins = ['<h2>0 Reports of ////</h2>', '>Select a job</el-button>', url_jobs, 'selected_nodes: [],']
ins = ['0 Reports of ////', '>Select a job</el-button>', url_jobs, 'selected_nodes: [],']
nos = ['>Select nodes</el-button>']
req(app, client, view='clusterreports', kws=dict(node=1), ins=ins)

@@ -37,7 +37,7 @@ def test_cluster_reports(app, client):
'1': 'on',
'2': 'on',
}
ins[0] = '<h2>%s Reports of /%s/%s/%s/</h2>' % (len(data), cst.PROJECT, cst.SPIDER, cst.JOBID)
ins[0] = '%s Reports of /%s/%s/%s/' % (len(data), cst.PROJECT, cst.SPIDER, cst.JOBID)
ins[-1] = 'selected_nodes: [1, 2],'
ins.extend(nos)
ins.append(url_servers)
