This repository has been archived by the owner on Oct 15, 2023. It is now read-only.

YunaBraska/paginator

Paginator

Paginator retrieves HTML documents, with JavaScript support.

Requirements

  • Java 8 or later
  • Chrome installed on the machine

Docker image

Configurations

| ENV VARIABLE | DEFAULT     | DESCRIPTION               |
|--------------|-------------|---------------------------|
| SERVER_PORT  | 8089        | Server port               |
| N/A          | 10000       | HTML pages cache limit    |
| N/A          | 10800000 ms | HTML pages cache lifetime |
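
The cache limit and cache lifetime above can be pictured with a small sketch. This is not Paginator's actual implementation; the class `PageCacheSketch`, its methods, and the choice to evict the oldest entry first are assumptions made only to illustrate what a size limit and a lifetime mean for cached pages.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical page cache: bounded by entry count, entries expire after a fixed lifetime. */
final class PageCacheSketch {

    /** Cached HTML plus the timestamp (ms) at which it becomes stale. */
    private static final class CachedPage {
        final String html;
        final long expiresAt;

        CachedPage(String html, long expiresAt) {
            this.html = html;
            this.expiresAt = expiresAt;
        }
    }

    private final long lifeTimeMs;
    private final LongSupplier clock;
    private final Map<String, CachedPage> pages;

    PageCacheSketch(final int sizeLimit, long lifeTimeMs, LongSupplier clock) {
        this.lifeTimeMs = lifeTimeMs;
        this.clock = clock;
        // Insertion-ordered map that drops its oldest entry once the limit is exceeded.
        this.pages = new LinkedHashMap<String, CachedPage>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedPage> eldest) {
                return size() > sizeLimit;
            }
        };
    }

    /** Stores a page; it stays valid for lifeTimeMs from now. */
    synchronized void put(String url, String html) {
        pages.put(url, new CachedPage(html, clock.getAsLong() + lifeTimeMs));
    }

    /** Returns the cached HTML, or null if the page is unknown or expired. */
    synchronized String get(String url) {
        CachedPage page = pages.get(url);
        if (page == null) {
            return null;
        }
        if (clock.getAsLong() >= page.expiresAt) {
            pages.remove(url); // expired entries are removed lazily on access
            return null;
        }
        return page.html;
    }

    synchronized int size() {
        return pages.size();
    }
}
```

With the defaults from the table, such a cache would be created as `new PageCacheSketch(10000, 10800000L, System::currentTimeMillis)`. The clock is injected only so the expiry logic can be exercised without waiting.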

Endpoints

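| METHOD  | URL               | REQUEST BODY                                               | RETURN BODY                   | DESCRIPTION                     |
|---------|-------------------|------------------------------------------------------------|-------------------------------|---------------------------------|
| GET/PUT | /pages            | url, page_cache_ms* [optional]                             |                               | Get HTML page from URL          |
| GET/PUT | /pages/elements   | url, Map<queryId, cssQuery>, page_cache_ms* [optional]     | Map<queryId, List<Elements>>  | Get specific HTML elements      |
| GET/PUT | /pages            | url, content, page_cache_ms* [optional]                    |                               | Manually add HTML page to cache |
| GET/PUT | /pages/statistics |                                                            | size, maxLifeTime, sizeLimit  | Get cache statistics            |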

* page_cache_ms is optional. A value passed on a second call does not overwrite the previous one.
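
One way to read the page_cache_ms note is "first write wins": the lifetime chosen on the first call for a URL stays in effect, and later calls cannot change it. The sketch below shows that reading only. `PageLifetimeSketch` and `resolveLifetime` are hypothetical names, and falling back to the default lifetime when the field is omitted is an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical illustration of a per-URL lifetime that only the first call can set. */
final class PageLifetimeSketch {

    /** Default cache lifetime from the Configurations table. */
    static final long DEFAULT_LIFETIME_MS = 10_800_000L;

    private final ConcurrentMap<String, Long> lifetimeByUrl = new ConcurrentHashMap<>();

    /** Returns the lifetime in effect for the URL; pageCacheMs may be null when omitted. */
    long resolveLifetime(String url, Long pageCacheMs) {
        long requested = pageCacheMs == null ? DEFAULT_LIFETIME_MS : pageCacheMs;
        // putIfAbsent keeps an existing value, so a second call never overwrites the first.
        Long previous = lifetimeByUrl.putIfAbsent(url, requested);
        return previous == null ? requested : previous;
    }
}
```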

Examples

Get elements from HTML page

  • Request: GET http://localhost:8089/pages/elements
  • Body:
{
  "url": "parse.example.com",
  "css_queries": {
    "form_text": "form p"
  }
}
  • Response
{
  "form_text": [
    {
      "tag": "P",
      "text": "Some example text here.",
      "selector": "html > body > div > form > p:nth-child(1)",
      "attributes": {
      },
      "children": [
      ]
    }
  ]
}
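
The same request can be sent from Java. The following is a minimal sketch using only the JDK (Java 8 compatible). The class and method names are made up for illustration, the `Content-Type: application/json` header is an assumption about what the service expects, and PUT is used because `HttpURLConnection` does not send a request body with GET.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical client helper for the /pages/elements endpoint. */
final class ElementsRequestSketch {

    /** Quotes a JSON string, escaping backslashes and double quotes only. */
    static String quote(String value) {
        return '"' + value.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }

    /** Builds the request body shown in the example: url plus css_queries map. */
    static String elementsBody(String url, Map<String, String> cssQueries) {
        StringBuilder json = new StringBuilder("{\"url\":")
                .append(quote(url))
                .append(",\"css_queries\":{");
        String separator = "";
        for (Map.Entry<String, String> query : cssQueries.entrySet()) {
            json.append(separator)
                .append(quote(query.getKey()))
                .append(':')
                .append(quote(query.getValue()));
            separator = ",";
        }
        return json.append("}}").toString();
    }

    /** Sends the JSON body with PUT and returns the response body as text. */
    static String put(String endpoint, String jsonBody) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(endpoint).openConnection();
        connection.setRequestMethod("PUT");
        connection.setRequestProperty("Content-Type", "application/json"); // assumed
        connection.setDoOutput(true);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(jsonBody.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int read; (read = in.read(buffer)) != -1; ) {
                response.write(buffer, 0, read);
            }
            return new String(response.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }
}
```

Against a locally running instance, a call would look like `ElementsRequestSketch.put("http://localhost:8089/pages/elements", ElementsRequestSketch.elementsBody("parse.example.com", queries))`, where `queries` maps `form_text` to `form p`.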

Cache custom html pages

  • Request: POST http://localhost:8089/pages
  • Body:
{
  "url": "my.own.example.com",
  "content": "<!doctype html><html><head><title>Example Domain</title></head><body><div><h1>Example page</h1></div></body></html>"
}

Docker build image example

  • Create jar file: mvn clean -Dmaven.test.skip=true package
  • Build local image: docker build -t paginator .
  • Docker image tag latest for repo: docker tag paginator SOME_REPO_PATH/paginator:latest
  • Docker image push to repo: docker push SOME_REPO_PATH/paginator:latest

TODO

  • Async page call implementation [remove synchronized]
  • Endpoint to clear the cache
  • Configurable default cache limits
    ////((((((((((((((((((((((((((((((* **         
    //////////////////////////////////* */(/.      
    //////////////////////////////////* */////*    
    //////////////////////////////////* *////////. 
    //////////////////////////////////*            
    ///////......................,////////////////.
    //////////////////////////////////////////////.
    ///////...............................,///////.
    ///////******************************/////////.
    //////////////////////////////////////////////.
    //////*.           PAGINATOR          ,///////.
    //////////////////////////////////////////////.
    **********************************************.
    **********************************************.
    ********,....*********************************.
    ********,    *********************************.
            .,***********,    ,*******************.
             ,,,,,,,,,,,,,    ,*,,,,,      .,,,,,,.
             ,,,,,,,,,    ,,,,,,,,,,,      .,,,,,,.
      ................    .......,,,.   .......... 
      ,,,,,,.                    ,,,.  .,,,.       
      ,,,,,,.       ....     ,,,.                  
                    ,,,.     ,,,.                  
                ....                               
                ....                               
                    ....