A tool for search engine based competitive analysis. AWS Amplify project with many web crawlers for different search engines, and a simple web frontend for displaying historical results in Next.js

ExcitingTheory/amplify-spiders-v1

Amplify Spiders v1

Amplify Spiders v1 is an AWS Amplify project that hosts a Next.js site with real-time data and custom Lambda handlers, including container-based Lambdas that run TensorFlow.js. The project provides crawlers for several search engines and is intended for competitive analysis only.

Getting Started

To get started with Amplify Spiders v1, follow these steps:

  1. Clone the repository to your local machine.

  2. Install the necessary dependencies by running npm install.

  3. Set up your AWS Amplify environment by following the instructions in the amplify/README.md file.

  4. Run the project locally by running npm run dev.

  5. Build and push the container. CI/CD for this step still needs some work; see the Pre-push Hook for more information. The hook may need to be disabled on the first deploy.

  6. For the first deploy, rename the hook, run amplify push, then restore the hook's original name.

  7. Update the following secrets with amplify update function:

    • googleKey,
    • googleCx,
    • foursquareClientId,
    • foursquareClientSecret,
    • facebookAccessToken,
    • infogroupApiKey,
    • yellowpagesKey,
    • yelpApiToken,
    • foursquareApiKey
  8. Deploy the project to the cloud with the hook enabled by running ECR_REPO_NAME="" ACCOUNT_ID="" amplify push, where ECR_REPO_NAME is the repo that CDK generates.
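
Before running the final push, it can help to confirm that nothing from steps 7 and 8 was left blank. The following is a minimal sketch, not part of this project or of the Amplify CLI: `missingSettings`, `REQUIRED_SECRETS`, and `REQUIRED_DEPLOY_VARS` are hypothetical names, and the caller is assumed to supply the values (for example from the shell environment) as a plain object.

```typescript
// Secret names listed in step 7 above.
const REQUIRED_SECRETS = [
  "googleKey",
  "googleCx",
  "foursquareClientId",
  "foursquareClientSecret",
  "facebookAccessToken",
  "infogroupApiKey",
  "yellowpagesKey",
  "yelpApiToken",
  "foursquareApiKey",
] as const;

// Environment variables named in step 8 above.
const REQUIRED_DEPLOY_VARS = ["ECR_REPO_NAME", "ACCOUNT_ID"] as const;

type Settings = Record<string, string | undefined>;

// Hypothetical helper: returns every required name that is absent or blank.
function missingSettings(
  required: readonly string[],
  provided: Settings
): string[] {
  return required.filter((name) => (provided[name] ?? "").trim() === "");
}
```

For example, `missingSettings(REQUIRED_DEPLOY_VARS, { ECR_REPO_NAME: "", ACCOUNT_ID: "123456789012" })` returns `["ECR_REPO_NAME"]`, which signals that the repo name CDK generated still has to be filled in.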

Features and Functionality

Amplify Spiders v1 domain rank demo

Amplify Spiders v1 includes the following features and functionality:

  • Next.js site
  • Real-time data
  • Custom Lambda handlers
  • Lambda containers
  • TensorFlow.js with the Universal Sentence Encoder, used to compare search results against the search query
  • Crawler for Google custom search engine
  • Crawler for Citysearch
  • Crawler for Yelp
  • Crawler for Yellow Pages
  • Crawler for Foursquare
  • Can create a new user
  • Can log in as a user
  • Can create a domain to monitor
  • Can view all domains
  • Can view domain details
  • Can view historical rankings for a domain and a search engine as a line chart
  • Detects whether the domain appears on the first page of search results for a search engine
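
Two of the features above, first-page detection and comparing results to the query, can be illustrated with a short sketch. This is not the project's actual code: `hostOf`, `firstPageRank`, and `cosineSimilarity` are hypothetical helpers, treating subdomains as matches is an assumption, and cosine similarity is only one common way to compare sentence embeddings. The embeddings themselves are assumed to have been computed already (for instance by the Universal Sentence Encoder), so the sketch does not call TensorFlow.js.

```typescript
// Hypothetical helper: reduces a URL or bare hostname to a lowercase host
// without scheme, path, credentials, port, or a leading "www.".
function hostOf(urlOrHost: string): string {
  return urlOrHost
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // drop the scheme
    .replace(/[\/?#].*$/, "") // drop path, query, and fragment
    .replace(/^.*@/, "") // drop userinfo
    .replace(/:\d+$/, "") // drop the port
    .replace(/^www\./, "");
}

// Returns the 1-based position of the first result hosted on the monitored
// domain (or, by assumption, one of its subdomains), or null if none matches.
function firstPageRank(domain: string, resultUrls: string[]): number | null {
  const target = hostOf(domain);
  const index = resultUrls.findIndex((url) => {
    const host = hostOf(url);
    return host === target || host.endsWith(`.${target}`);
  });
  return index === -1 ? null : index + 1;
}

// Cosine similarity between two equal-length embedding vectors.
// Returns 0 when either vector has zero length in the Euclidean sense.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

With the result list `["https://other.com/a", "https://shop.example.com/x", "https://www.example.com/"]`, `firstPageRank("example.com", ...)` returns `2`, because the subdomain match comes first. A `null` return means the domain is not on the page that was crawled.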

Roadmap

Amplify Spiders v1 is currently in development. The following features and functionality are planned for future releases:

  • Finish the main site menubar (can log in, but not log out yet) IN PROGRESS
  • Remove Lambda containers by tree-shaking this library to reduce bundle size.
  • Crawler for Facebook Business: Need to get the app approved by Facebook for the demo site
  • Crawler for Bing?
  • Crawler for Yahoo?
  • CI/CD for Container Lambda handlers?
  • Find good sources of regional statistical and demographic data for cross-referencing with search results?

Contributing

See the CONTRIBUTING.md file for information on how to contribute to Amplify Spiders v1.

License

Amplify Spiders v1 is licensed under the MIT License. See LICENSE.txt for more information.

Code of Conduct

Amplify Spiders v1 has adopted the Contributor Covenant Code of Conduct. See CODE_OF_CONDUCT.md for more information.
