Question: how to store data in json #137
Comments
For portiacrawl if you add |
Sorry, maybe I didn't express my problem clearly. In my template I use variants to scrape items: and it works well when I click continue browsing or turn to a similar page: So I started running the spider in the console: Then I can see the data I want in my console, but when the spider finishes, only the items from http://bbs.chinadaily.com.cn/forum-83-2.html are stored: I don't know what I'm missing... Thanks very much!!!;) |
What are your start urls? |
http://bbs.chinadaily.com.cn/forum-83-2.html |
Which links are green when you check the 'Overlay blocked links' box? |
Does it work if you create a file
|
Yes it works when I store data in a json file. Thank you very much!!!;) Thanks a lot! |
I'm not sure what's happening with your CSV file, but with your JSON file there are a few other things you could try that might make it more suitable for insertion into the database. The duplicate filter was being triggered because all of your fields are marked as variant, and as such only the first item is being kept. You could solve this by adding a dummy non-variant field that is different on each page.
Then in your
Please report back if it goes well or if you need more help. |
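To illustrate the explanation above, here is a minimal sketch of a duplicate filter that fingerprints only the non-variant fields. It is not Portia's or slybot's actual code; the function names, the `title` and `url` fields, and the sample values are all hypothetical. It only shows why items with no non-variant fields would all look identical to such a filter, and why a per-page non-variant field changes that.

```python
import hashlib
import json


def fingerprint(item, variant_fields):
    """Hash only the fields that are NOT marked as variant (hypothetical helper)."""
    stable = {k: v for k, v in item.items() if k not in variant_fields}
    payload = json.dumps(stable, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def drop_duplicates(items, variant_fields):
    """Keep the first item seen for each fingerprint, drop the rest."""
    seen = set()
    kept = []
    for item in items:
        fp = fingerprint(item, variant_fields)
        if fp in seen:
            continue  # same fingerprint as an earlier item: treated as a duplicate
        seen.add(fp)
        kept.append(item)
    return kept


# Two made-up items from two different pages.
pages = [
    {"title": "post A", "url": "http://bbs.chinadaily.com.cn/forum-83-2.html"},
    {"title": "post B", "url": "http://bbs.chinadaily.com.cn/forum-83-206.html"},
]

# Every field is variant: nothing is left to hash, so both fingerprints match.
all_variant = drop_duplicates(pages, variant_fields={"title", "url"})

# "url" acts as the dummy non-variant field: fingerprints now differ per page.
with_dummy = drop_duplicates(pages, variant_fields={"title"})
```

In this sketch `all_variant` keeps only the first item, while `with_dummy` keeps both, which mirrors the behaviour described in the comment above.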
You're very welcome. |
My pleasure ^_^ |
How do I store the data in MySQL? I am puzzled!! |
This is my first time using Portia. I have stored data as JSON, but I don't know how to store it in MySQL. Thanks. |
Same problem here. Hello maoouyang, have you figured out how to do it? |
I used the same settings as you, but I always received: |
Now my question is the same as yours. |
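For the MySQL questions above, one common approach with Scrapy-based spiders is an item pipeline that inserts each item through a DB-API connection. The sketch below is an assumption-laden illustration, not something confirmed in this thread: the class name, the `posts` table, and the `title` and `url` fields are hypothetical, and it uses the standard-library `sqlite3` module as a stand-in so it runs without a database server. With a MySQL driver you would pass that driver's connection instead and use its placeholder style (typically `%s`).

```python
import sqlite3


class SqlStorePipeline:
    """Scrapy-style item pipeline writing items to a SQL table (hypothetical)."""

    def __init__(self, connection, placeholder="?"):
        # connection: any DB-API 2.0 connection; placeholder: "?" for sqlite3,
        # usually "%s" for MySQL drivers.
        self.conn = connection
        self.placeholder = placeholder

    def open_spider(self, spider):
        # Create the (hypothetical) target table when the spider starts.
        cur = self.conn.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS posts (title TEXT, url TEXT)")
        self.conn.commit()

    def process_item(self, item, spider):
        # Parameterized insert; values are passed separately from the SQL text.
        sql = "INSERT INTO posts (title, url) VALUES ({0}, {0})".format(
            self.placeholder
        )
        cur = self.conn.cursor()
        cur.execute(sql, (item.get("title"), item.get("url")))
        self.conn.commit()
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        self.conn.close()


# Stand-alone demonstration with an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
pipeline = SqlStorePipeline(conn)
pipeline.open_spider(spider=None)
pipeline.process_item(
    {"title": "post A", "url": "http://bbs.chinadaily.com.cn/forum-83-2.html"},
    spider=None,
)
stored = conn.cursor().execute("SELECT title, url FROM posts").fetchall()
```

In a Scrapy project a pipeline like this is normally enabled through the `ITEM_PIPELINES` setting, and the connection would be created from your own database credentials rather than passed in as shown here. Scraped field values may also be lists rather than plain strings, in which case they would need to be flattened before insertion.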
Hi, I recently installed Portia and it's quite good. But since I'm not very familiar with it, I have run into a problem:
In my template I used variants to scrape the website, and it worked well when I clicked continue browsing or turned to a similar page. However, when I ran the Portia spider in cmd, it scraped the data successfully but failed to store all of the scraped data in the JSON file. I found that the JSON file contained only the data from the page I had annotated.
I guess the problem is "WARNING: Dropped: Duplicate product scraped at http://bbs.chinadaily.com.cn/forum-83-206.html, first one was scraped at http://bbs.chinadaily.com.cn/forum-83-2.html" (it appeared in cmd when the spider was running), but I don't know how to solve the problem.
I hope someone can help me as soon as possible.
Thanks very much!