This repository has been archived by the owner on Mar 9, 2024. It is now read-only.

📝OLX Notification is a script created using Python and AWS Lambda that allows you to receive daily emails with new announcements on the olx.pl portal from various categories

DEENUU1/olx-notification


OLX Notification

Get your daily offers by email.

Report Bug · Request Feature

About The Project

OLX Notification collects offers from the olx.pl portal every day and sends them to the e-mail address you specify.

It relies on the public API provided by OLX and on AWS Lambda, which runs the code automatically on a schedule.

The data is sent by e-mail as an Excel file.
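
As a rough illustration of the "collect data" step, the sketch below flattens a decoded API response into rows that could later be written to a spreadsheet. It is not taken from the repository: the `data` list and the `title`, `url` and `created_time` keys are assumptions about the payload, and `offers_to_rows` is a hypothetical helper.

```python
import json

# Hypothetical payload: the "data" list and the field names below are
# assumptions for illustration, not a documented OLX response schema.
SAMPLE_RESPONSE = json.loads("""
{"data": [
  {"title": "Flat A", "url": "https://www.olx.pl/d/oferta/a",
   "created_time": "2023-10-01T10:00:00+02:00"},
  {"title": "Flat B", "url": "https://www.olx.pl/d/oferta/b"}
]}
""")

COLUMNS = ("title", "url", "created_time")


def offers_to_rows(payload, columns=COLUMNS):
    """Turn a decoded JSON payload into a list of flat dicts, one per offer."""
    rows = []
    for offer in payload.get("data", []):
        # Missing keys become empty strings so every row has the same columns.
        rows.append({name: offer.get(name, "") for name in columns})
    return rows


rows = offers_to_rows(SAMPLE_RESPONSE)
```

Rows in this shape can be handed to `pandas.DataFrame(rows)` and saved with `DataFrame.to_excel`, which uses Openpyxl for `.xlsx` files.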

Key Features

  1. Scraping data from olx.pl
  2. Serverless architecture - AWS Lambda
  3. Saving scraped data to an Excel file
  4. Sending data by e-mail
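
Features 3 and 4 can be sketched with the standard library alone. The code below is an assumption about how such a message might be built, not the repository's implementation: `build_message` and `send` are hypothetical helpers, the subject, body and file name are made up, and only the environment variable names (`FROM_EMAIL`, `TO_MAIL`, `SMTP_USERNAME`, `SMTP_PASSWORD`) come from the AWS Lambda configuration step in the Installation section.

```python
import smtplib
from email.message import EmailMessage

# MIME type of .xlsx workbooks.
XLSX_MAINTYPE = "application"
XLSX_SUBTYPE = "vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_message(env, xlsx_bytes, filename="offers.xlsx"):
    """Create an e-mail with the workbook attached (nothing is sent here)."""
    msg = EmailMessage()
    msg["From"] = env["FROM_EMAIL"]
    msg["To"] = env["TO_MAIL"]
    msg["Subject"] = "OLX offers"  # hypothetical subject line
    msg.set_content("Today's offers are attached.")
    msg.add_attachment(
        xlsx_bytes,
        maintype=XLSX_MAINTYPE,
        subtype=XLSX_SUBTYPE,
        filename=filename,
    )
    return msg


def send(msg, env, host="smtp.gmail.com", port=465):
    """Send the message over SSL; assumes a Gmail account with an app password."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(env["SMTP_USERNAME"], env["SMTP_PASSWORD"])
        server.send_message(msg)


# Demo with placeholder values; send() is deliberately not called.
demo_env = {
    "FROM_EMAIL": "me@example.com",
    "TO_MAIL": "you@example.com",
    "SMTP_USERNAME": "me@example.com",
    "SMTP_PASSWORD": "app-password",
}
message = build_message(demo_env, b"placeholder bytes, not a real workbook")
```

Inside the Lambda function, `os.environ` could be passed as `env`, so the credentials stay in the function configuration rather than in the code.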

Built With

  • Python
    • Pandas
    • Openpyxl
    • Requests
  • AWS Lambda

Installation

Tutorials

https://www.youtube.com/watch?v=o3s4VqlMsT8&t=228s - AWS CDK configuration

https://aws.plainenglish.io/lambda-layer-how-to-create-them-python-version-bc1e027c5fea - How to add a Python library as a .zip layer
  1. Clone git repository
git clone https://github.com/DEENUU1/olx-notification.git
  1. Install all requirements
pip install -r requirements.txt
  1. Add the URLs you want to scrape data from
olx_notification/olx.py

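URLS_TO_SCRAPE = {
    # Label of your choice: URL of the matching OLX API request
    "Łódź mieszkania wynajem": "https://www.olx.pl/api/v1/offers/?offset=40&limit=40&category_id=15&sort_by=created_at%3Adesc&filter_refiners=spell_checker&sl=18ae25cfa80x3938008f",
    # Placeholder entry: replace it with your next label and API URL
    "Next your": "https://www.olx.pl/",
}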
  • To get the URL, go to olx.pl
  • Choose a category, for example Nieruchomości
  • Narrow it down with all the filters you need, for example:
    • Mieszkania
    • Wynajem
    • Warszawa
  • Right-click the page and open DevTools
  • Go to the Network tab and refresh the page (F5)
  • Scroll to the bottom and go to page 2
  • Scroll to the bottom again and sort the results in the Network tab by Type
  • Find the request whose URL starts with https://www.olx.pl/api/v1/offers/ (like the example above)
  • Click this request, copy its URL, and paste it into the `URLS_TO_SCRAPE` dictionary in the `olx.py` file
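
The copied URL carries an ordinary query string, so it can be inspected and adjusted with the standard library. The sketch below is not part of the repository: `with_offset` is a hypothetical helper, and the idea that `offset=0` returns the first page of results is an assumption based on page 2 showing `offset=40` with `limit=40`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

url = (
    "https://www.olx.pl/api/v1/offers/?offset=40&limit=40&category_id=15"
    "&sort_by=created_at%3Adesc&filter_refiners=spell_checker"
    "&sl=18ae25cfa80x3938008f"
)


def with_offset(url, offset):
    """Return the same URL with its `offset` query parameter replaced."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # maps each name to a list of values
    query["offset"] = [str(offset)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


# Decoded parameters of the URL copied from DevTools.
params = parse_qs(urlsplit(url).query)

# Assumed URL of the first page (see the caveat above).
first_page = with_offset(url, 0)
```

Printing `params` is a quick way to check that the category and filters you selected on the site actually ended up in the copied URL.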
  1. Download AWS CLI
https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html 
  1. Configure AWS CLI
aws configure 
  1. Install aws-cdk (npm is required)
npm install -g aws-cdk@latest
  1. Bootstrap
cdk bootstrap aws://<your_account_id>/<your_region>
  1. Deploy code to AWS Lambda
cdk deploy 
  1. AWS Lambda configuration
  • Go to Lambda/Functions/OlxNotificationStack....
  • Click on Configuration
    • Go to General configuration and click edit

    • Change Timeout (I set it to 1 min)

    • Go to Environment variables and add the following (Key: value)

      To get an SMTP_PASSWORD, open your Google account and go to Security -> 2-Step Verification -> App passwords, then copy the generated password

      • FROM_EMAIL: your Gmail address
      • SMTP_PASSWORD: the app password generated for your Gmail account
      • SMTP_USERNAME: your Gmail address
      • TO_MAIL: the address that should receive the offers
  • Now go back to your AWS Lambda function and scroll down to the Layers section
  • You will add a layer here; first build its package locally (the commands below are for Windows):
    mkdir proj  (create a new folder)
    cd .\proj\  (go to this folder)
    python -m venv venv  (create python env)
    venv\Scripts\activate  (activate python env)
    mkdir python  (inside proj directory create a folder called python)
    cd python
    pip install openpyxl -t .  (install package)
    • After these commands, go to the proj directory and compress the python folder into a .zip file (python.zip)
    • Now go to Layers and click Create layer
      • Enter a name, upload python.zip, select both x86_64 and arm64 as compatible architectures, and choose the matching Python version under Runtimes
    • Go back to your AWS Lambda function
    • Click Add a layer and select Custom layers
    • Select the layer that you created and click add
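
If you prefer to script the compression step, the sketch below zips the `python` folder so that every entry in the archive starts with `python/`, which is the layout Lambda expects for Python layers. `zip_layer` is a hypothetical helper, and the demo runs against a throwaway directory rather than your real `proj` folder.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_layer(project_dir, zip_path):
    """Zip project_dir/python so every archive entry starts with 'python/'."""
    project_dir = Path(project_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted((project_dir / "python").rglob("*")):
            if path.is_file():
                # Store paths relative to the project folder, with forward slashes.
                archive.write(path, path.relative_to(project_dir).as_posix())
    return zip_path


# Demo: build a tiny fake package and list what ends up in the archive.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "python" / "demo_pkg"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    archive_path = zip_layer(tmp, Path(tmp) / "python.zip")
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
```

For the real layer you would call `zip_layer("proj", "python.zip")` after the `pip install ... -t .` step.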
  1. Use EventBridge
  • To run this script every day, you need to use EventBridge
  • Go to your function in AWS Lambda and choose Add trigger
  • Then select EventBridge (CloudWatch Events)
  • Choose Create a new rule
    • Add some name
    • Choose Schedule expression
    • In Schedule expression add this - cron(0 14 ? * MON-SAT *)
    • Click Add

With this configuration the script runs at 14:00 UTC from Monday to Saturday (EventBridge evaluates cron expressions in UTC).
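
To double-check what `cron(0 14 ? * MON-SAT *)` means, the sketch below lists the next firing times for that one expression. It is a hand-written check, not a general cron parser, and `next_runs` is a hypothetical helper; it relies on EventBridge evaluating the expression in UTC.

```python
from datetime import datetime, timedelta, timezone


def next_runs(start, count):
    """Next `count` firing times of cron(0 14 ? * MON-SAT *), in UTC.

    `start` must be a timezone-aware datetime.
    """
    start = start.astimezone(timezone.utc)
    candidate = start.replace(hour=14, minute=0, second=0, microsecond=0)
    if candidate <= start:
        candidate += timedelta(days=1)  # today's 14:00 has already passed
    runs = []
    while len(runs) < count:
        if candidate.weekday() != 6:  # Monday is 0, Sunday is 6; skip Sunday
            runs.append(candidate)
        candidate += timedelta(days=1)
    return runs


start = datetime(2024, 3, 8, 15, 0, tzinfo=timezone.utc)  # Friday afternoon
runs = next_runs(start, 3)
```

Starting from that Friday afternoon, the list contains Saturday, then Monday and Tuesday; Sunday is skipped.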

License

See LICENSE.txt for more information.
