Any way to send HTTP POST requests? #106
Comments
I am also looking for a way to follow links or parse the next page. Sometimes the first URL is not the one you want, for example when you want to parse the first result of a search page rather than the search page itself. |
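As a rough illustration of the "first result of a search page" case, here is a minimal standard-library sketch that pulls the first result link out of a search page so it can be fetched in a second step. It is not a toapi feature. The names `FirstLinkParser` and `first_result_url` are hypothetical, and the assumption that each result is an `<a>` inside an `<h3>` will not hold for every site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstLinkParser(HTMLParser):
    """Remembers the href of the first <a> that appears inside an <h3>."""

    def __init__(self):
        super().__init__()
        self._in_h3 = False
        self.first_href = None

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3 and self.first_href is None:
            self.first_href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False


def first_result_url(search_html, base_url):
    """Return the absolute URL of the first result link, or None if there is none."""
    parser = FirstLinkParser()
    parser.feed(search_html)
    if parser.first_href is None:
        return None
    # Resolve relative links against the URL of the search page.
    return urljoin(base_url, parser.first_href)
```

The returned URL can then be requested and parsed like any other page.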
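There is a solution here:
import requests
from toapi import Api

# `url`, `data`, and `item` are placeholders for your own values.
api = Api(url)
app = api.server.app

@app.route('/post_page/')
def post_method():
    # Inspect the AJAX POST request the source site sends and reproduce it here.
    res = requests.post(url, data)
    return item.parse(res.text)
|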
This example could help you. https://github.com/gaojiuli/toapi/blob/master/examples/hackernews_page.py |
@gaojiuli Thanks! |
@gaojiuli Where does
but they are empty. |
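Because of toapi's built-in [...], we need to add a Flask route ourselves to get this working. Here is a fairly detailed example. Suppose I need to fetch the data at this URL with a POST request and then parse it the toapi way.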
# items/search.py
from toapi import Item, XPath


class Search(Item):
    '''
    Parse each entry on the search results page into:
    title, book id, link, summary
    '''
    title = XPath('//h3/a/text()')
    book_id = XPath('//h3/a/@href')
    url = XPath('//h3/a/@href')
    content = XPath('//p[2]/text()')

    def clean_title(self, title):
        return ''.join(title)

    def clean_book_id(self, book_id):
        return book_id.split('-')[1]

    def clean_url(self, url):
        return url[:url.find('?')]

    class Meta:
        source = XPath('//li[@class="pbw"]')
        # Leave route empty here to avoid registering the route twice.
        route = {}
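To make the two href-cleaning steps above concrete, here is a small standalone sketch. The sample href is hypothetical (the thread does not show the forum's real link format), and the functions assume each one receives the href as a single string. It also shows that slicing with `str.find` drops the last character when the href has no `?`. `clean_url_safe` is a hypothetical alternative built on `str.partition`, not part of the original item.

```python
def clean_book_id(href):
    # Same logic as Search.clean_book_id: take the second dash-separated field.
    return href.split('-')[1]


def clean_url(href):
    # Same logic as Search.clean_url: keep everything before the first '?'.
    # If there is no '?', find() returns -1 and the slice drops the last character.
    return href[:href.find('?')]


def clean_url_safe(href):
    # partition() returns the whole string as the first element when '?' is absent.
    return href.partition('?')[0]


sample = 'thread-12345-1-1.html?highlight=python'  # hypothetical href
book_id = clean_book_id(sample)                    # '12345'
trimmed = clean_url(sample)                        # 'thread-12345-1-1.html'
truncated = clean_url('thread-12345-1-1.html')     # 'thread-12345-1-1.htm'
intact = clean_url_safe('thread-12345-1-1.html')   # 'thread-12345-1-1.html'
```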
import json

import requests
from toapi import Api

from items.search import Search
from settings import MySettings

api = Api('', settings=MySettings)
api.register(Search)


@api.server.app.route('/search/<keyword>')
def search_page(keyword):
    '''
    91bay new-books forum
    Search feature
    '''
    # Form fields sent with the site's search POST request.
    data = {
        'searchsel': 'forum',
        'mod': 'forum',
        'srchtype': 'title',
        'srchtxt': keyword,
    }
    r = requests.post(
        'http://91baby.mama.cn/search.php?searchsubmit=yes', data)
    r.encoding = 'utf8'
    html = r.text
    results = {}
    items = [Search]
    # Parse the page with toapi's own parser.
    for item in items:
        parsed_item = api.parse_item(html, item)
        results[item.__name__] = parsed_item
    # Return the results as JSON.
    return api.server.app.response_class(
        response=json.dumps(results, ensure_ascii=False),
        status=200,
        mimetype='application/json'
    )


if __name__ == '__main__':
    api.serve()
This way we can parse the POST data by visiting http://127.0.0.1:5000/search/keyword. Because this approach has no support from toapi itself [...]
|
Hi @Ehco1996
|
While working with toapi I came across a web page with a paginated HTML table.
Clicking "next page" issues an AJAX POST request to fetch the next set of records in the data set.
Is there any way to accomplish this with toapi?
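For reference, a "next page" click of this kind usually comes down to a form-encoded POST carrying a page number. Below is a minimal standard-library sketch of reproducing such a request outside the browser. The field name `page`, the helper names `build_page_request` and `fetch_page`, and the form encoding are all assumptions. The real field names have to be read from the request shown in the browser's network tab.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_page_request(url, page, extra_fields=None):
    """Build a form-encoded POST request asking for one page of the table."""
    fields = {"page": str(page)}  # "page" is a hypothetical field name
    if extra_fields:
        fields.update(extra_fields)
    body = urlencode(fields).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_page(url, page, timeout=10):
    """Send the request and return the response body as text."""
    with urlopen(build_page_request(url, page), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

The returned HTML could then be handed to a parser, for example the `api.parse_item(html, item)` call used earlier in this thread.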