Before I start, I just want to say that you all have done a great job developing this project. I love Gerapy. I will probably start contributing to the project. I will try to document this as well as I can so it can be helpful to others.
Describe the bug
I have a scrapy project which runs perfectly fine in terminal using the following command:
scrapy crawl examplespider
However, when I schedule it as a task and run it on my local Scrapyd client, the job starts but immediately closes without doing anything and throws no errors. I think it's a config file issue. When I view the results of the job, it shows the following:
In the logs it shows the following:
/home/ubuntu/env/scrape/bin/logs/examplescraper/examplespider
/home/ubuntu/gerapy/logs
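Independent of the Gerapy UI, Scrapyd's own listjobs.json endpoint reports whether a job is pending, running, or finished, which can help tell a scheduling problem from a spider problem. A minimal sketch, assuming Scrapyd is on the default port used below; the `job_states` helper and project name are illustrative, not part of the original report:

```python
import json
from urllib.request import urlopen

SCRAPYD_URL = "http://127.0.0.1:6800"  # default Scrapyd address/port

def parse_listjobs(data):
    """Group job IDs by state from a Scrapyd listjobs.json response."""
    return {state: [job["id"] for job in data.get(state, [])]
            for state in ("pending", "running", "finished")}

def job_states(project):
    """Query Scrapyd for one project's jobs (makes a network call)."""
    with urlopen(f"{SCRAPYD_URL}/listjobs.json?project={project}") as resp:
        return parse_listjobs(json.load(resp))
```

If the job shows up under "finished" seconds after scheduling, the spider itself exited immediately, which matches the behavior described above.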
To Reproduce
Steps to reproduce the behavior:
1. Paste the following:
2. Issue the following commands. The output should say: active (running)
3. Create a script to run Gerapy as a systemd service and paste the following:
4. Give this file execute permissions (sudo chmod +x runserve-gerapy.sh), then navigate back to systemd and create a service to run runserve-gerapy.sh. Paste the following and again issue the same commands. Look for active (running) and navigate to http://your.pub.ip.add:8000, http://localhost:8000, or http://127.0.0.1:8000 to verify that it is running. Reboot the instance to verify that the services run on system startup.
5. Log in and create a client for the local Scrapyd service. Use IP 127.0.0.1 and port 6800, no auth. Save it as "Local" or "Scrapyd".
6. Create a project. Select Clone. For testing I used the following GitHub Scrapy project: https://github.com/eneiromatos/NebulaEmailScraper (actually a pretty nice starter project). Save the project. Build the project. Deploy the project. (If you get an error when deploying, make sure you are running in the virtual env; you might need to reboot.)
7. Create a task. Make sure the project name and spider name match what is in the scrapy.cfg and examplespider.py files, and save the task. Schedule the task. Run the task.
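The service files and commands pasted in steps 1-4 did not come through with the report. Purely as an illustration, the units and helper script those steps describe might look like the following; every filename and path here is an assumption (the virtualenv and Gerapy paths are guessed from the log locations above), not the original content:

```ini
# /etc/systemd/system/scrapyd.service -- hypothetical sketch, not the original
[Unit]
Description=Scrapyd service
After=network.target

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu
ExecStart=/home/ubuntu/env/scrape/bin/scrapyd
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

```shell
#!/bin/bash
# runserve-gerapy.sh -- hypothetical sketch; paths are assumptions
cd /home/ubuntu/gerapy
exec /home/ubuntu/env/scrape/bin/gerapy runserver 0.0.0.0:8000
```

```ini
# /etc/systemd/system/gerapy.service -- hypothetical sketch, not the original
[Unit]
Description=Gerapy service
After=network.target scrapyd.service

[Service]
User=ubuntu
ExecStart=/home/ubuntu/runserve-gerapy.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After sudo systemctl daemon-reload and enabling both units, systemctl status should report active (running) as the steps describe.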
Traceback
See logs above ^^^
Expected behavior
It should run for at least 5 minutes and output to a file called emails.json in the project root folder (the folder with the scrapy.cfg file).
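One config detail worth checking for the missing emails.json: Scrapyd runs spiders from its own working directory, not the project root, so a relative feed path like emails.json will not land next to scrapy.cfg. A minimal sketch of pinning the output with Scrapy's FEEDS setting; the output directory here is an assumption, not from the original project:

```python
# settings.py -- hypothetical snippet; OUTPUT_DIR is an assumed location
import os

OUTPUT_DIR = "/home/ubuntu/output"
FEEDS = {
    os.path.join(OUTPUT_DIR, "emails.json"): {
        "format": "json",    # JSON feed export, matching the expected output
        "overwrite": True,   # replace the file on each run
    },
}
```

With an absolute path configured, the feed ends up in the same place whether the spider is launched from the terminal or by Scrapyd.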
Screenshots
I can upload screenshots if requested.
Environment (please complete the following information):
OS: AWS Ubuntu 20.04
Browser: Firefox
Python Version: 3.8
Gerapy Version: 0.9.11 (latest)