New search chain that doesn't use serpapi #199
Comments
@hwchase17 From my initial testing, setting up a custom Google search API through GCP produces nearly identical scraped results compared with serpapi. For example, here is a comparison of the scraped text produced: https://www.diffchecker.com/L4oCHMUA A big benefit is that the Custom Search API allows 10000 requests per day, compared to only 100 a month offered by serpapi. I will work on comparing the Custom Bing Search API offered by Microsoft with serpapi (Bing) next. I think the browserless scraping with requests is also a limitation, as the service used in the WebGPT example allows only 1000 instances a month. I will have to do more research into this as well. I think Selenium WebDriver may be a decent alternative. Thank you for all of your work!
Related to #199

Motivation: SerpAPI is very expensive, and scraping can cause problems.

Solution: Implemented the new Google search API through the Programmable Search Engine. It takes a bit longer to set up but is worth it, as it gives 10,000 search queries per day for free.

Instructions:

1. Install google-api-python-client
   - If you don't already have a Google account, sign up.
   - If you have never created a Google APIs Console project, read the Managing Projects page and create a project in the Google API Console.
   - Install the library using `pip install google-api-python-client`. The current version of the library is 2.70.0 at this time.
2. To create an API key:
   - Navigate to the APIs & Services→Credentials panel in Cloud Console.
   - Select Create credentials, then select API key from the drop-down menu.
   - The API key created dialog box displays your newly created key.
   - You now have an `API_KEY`.
3. Set up a Custom Search Engine so you can search the entire web
   - Create a custom search engine in this link.
   - In Sites to search, add any valid URL (e.g. www.stackoverflow.com).
   - That’s all you have to fill in; the rest doesn’t matter.
   - In the left-side menu, click Edit search engine → {your search engine name} → Setup.
   - Set Search the entire web to ON.
   - Remove the URL you added from the list of Sites to search.
   - Under Search engine ID you’ll find the `search-engine-ID`.
4. Enable the Custom Search API
   - Navigate to the APIs & Services→Dashboard panel in Cloud Console.
   - Click Enable APIs and Services.
   - Search for Custom Search API and click on it.
   - Click Enable.
   - URL for it: https://console.cloud.google.com/apis/library/customsearch.googleapis.com

Instructions adapted from https://stackoverflow.com/questions/37083058/programmatically-searching-google-in-python-using-custom-search

- [X] Implementation
- [X] Tests, Inline Docs, Formatting
- [ ] Add it to `load_tools`.
- [ ] Improve external documentation

I think it still needs some general work here.
Has been a while since I've coded in Python. I tried my best to follow the steps provided in the resources.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
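For anyone following the setup steps above, here is a minimal sketch of how the resulting `API_KEY` and `search-engine-ID` might be used with `google-api-python-client`. It is not the code from the PR; the function names `google_search` and `extract_snippets` are hypothetical, and the response shape (an `items` list whose entries carry a `snippet` field) is assumed from the Custom Search JSON API.

```python
from typing import Any, Dict, List


def google_search(query: str, api_key: str, search_engine_id: str,
                  num: int = 10) -> Dict[str, Any]:
    """Run one Custom Search query and return the raw JSON response.

    Hypothetical helper: api_key is the API_KEY from step 2 and
    search_engine_id is the search-engine-ID from step 3.
    """
    # Imported lazily so extract_snippets() works without the client installed.
    from googleapiclient.discovery import build

    service = build("customsearch", "v1", developerKey=api_key)
    # cx identifies the Programmable Search Engine; num is capped at 10 per request.
    return service.cse().list(q=query, cx=search_engine_id, num=num).execute()


def extract_snippets(response: Dict[str, Any]) -> List[str]:
    """Collect result snippets; a response with no hits has no "items" key."""
    return [item["snippet"] for item in response.get("items", []) if "snippet" in item]
```

Usage would look roughly like `extract_snippets(google_search("langchain", API_KEY, SEARCH_ENGINE_ID))`, with both values read from your own configuration rather than hard-coded.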
Thanks for your work, @hwchase17! Let me know if you need SerpApi search credits for research and development.
thanks for offering @ilyazub! very generous. big fan of serpapi so i'm personally fine paying atm (not doing a ton of research). closing this issue as we have both a google search and bing alternative
too few free trials, too expensive