Add logger backend which sends data to UI
oltarasenko committed Dec 15, 2020
1 parent efab4f9 commit 52e8308
Showing 2 changed files with 60 additions and 41 deletions.
56 changes: 15 additions & 41 deletions documentation/experimental_ui.md
@@ -1,15 +1,7 @@
# Experimental UI

We believe that web scraping is a process. It might seem easy to extract the
first data items, but reliable data delivery requires more effort and a
process that supports it.

Our aim is to provide you with the following services:

1. Schedule (start and stop) your spiders in the cloud.
2. View running jobs (performance-based analysis).
3. View and validate scraped items for quality assurance and data analysis purposes.
4. View individual items and compare them with the actual website.
If you are interested in our attempts to make crawling more predictable,
have a look at: https://github.com/oltarasenko/crawly_ui

## Setting it up

Expand All @@ -32,37 +24,19 @@ For setting up erlang-node-discovery
]
```

## Testing it locally with a docker-compose

CrawlyUI ships with a docker-compose setup which brings up UI, worker, and
database nodes, so everything is ready for testing with a single command.

To try it:
1. Clone the crawly_ui repo: `git clone git@github.com:oltarasenko/crawly_ui.git`
2. Build the ui and worker nodes: `docker-compose build`
3. Apply migrations: `docker-compose run ui bash -c "/crawlyui/bin/ec eval \"CrawlyUI.ReleaseTasks.migrate\""`
4. Run it all: `docker-compose up`

## Live demo

A live demo is available as well. However, it might be a bit unstable due to our continuous release process.
Please give it a try and let us know what you think!

[Live Demo](http://18.216.221.122/)

## Setting up logger

You can send logs to CrawlyUI as well. To do that, add the following
(changing the node name to your own, of course) to your config:
```
# Tell Logger to load the Crawly.Loggers.SendToUiBackend backend
config :logger,
  backends: [
    :console,
    {Crawly.Loggers.SendToUiBackend, :send_log_to_ui}
  ],
  level: :debug

config :logger, :send_log_to_ui, destination: {:"ui@127.0.0.1", CrawlyUI, :store_log}
```

## Items browser

One of the cool features of CrawlyUI is the items browser, which allows comparing
extracted data with the target website loaded in an IFRAME. However, as sites may block iframes, a browser extension may be used as a workaround to ignore X-Frame-Options headers.
For example:
[Chrome extension](https://chrome.google.com/webstore/detail/ignore-x-frame-headers/gleekbfjekiniecknbkamfmkohkpodhe)

## Gallery

![Main Page](documentation/assets/main_page.png?raw=true)

![Items browser](documentation/assets/items_page.png?raw=true)

![Items browser search](documentation/assets/item_with_filters.png?raw=true)

![Items browser](documentation/assets/item_preview_example.png?raw=true)
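With the backend configured, any log line carrying a `:crawl_id` metadata key is forwarded to the UI node, while lines without it stay local. A minimal usage sketch (the `crawl_id` value here is hypothetical, for illustration only):

```
require Logger

# Forwarded to the UI node: carries a :crawl_id metadata key
Logger.info("Scraped 10 items", crawl_id: "my-crawl-123")

# Stays local: no :crawl_id metadata, so the backend ignores it
Logger.info("Local-only message")
```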
45 changes: 45 additions & 0 deletions lib/crawly/loggers/send_to_ui_backend.ex
@@ -0,0 +1,45 @@
defmodule Crawly.Loggers.SendToUiBackend do
  @moduledoc """
  A Logger backend which forwards log messages to a CrawlyUI node.

  Messages carrying a `:crawl_id` metadata key are sent to the configured
  `{node, module, function}` destination via `:rpc.cast/4`; all other
  messages are ignored.
  """

  # Initialize the configuration
  def init({__MODULE__, name}) do
    {:ok, configure(name, [])}
  end

  def handle_call({:configure, opts}, %{name: name} = state) do
    {:ok, :ok, configure(name, opts, state)}
  end

  # Handle the flush event
  def handle_event(:flush, state) do
    {:ok, state}
  end

  # Handle any log messages that are sent across
  def handle_event(
        {_level, _group_leader, {Logger, message, _timestamp, metadata}},
        %{destination: {node, module, function}} = state
      ) do
    case Keyword.get(metadata, :crawl_id) do
      nil ->
        :ignore

      crawl_id ->
        :rpc.cast(node, module, function, [crawl_id, message])
    end

    {:ok, state}
  end

  defp configure(name, []) do
    case Application.get_env(:logger, name, []) do
      [] ->
        raise "Destination was not configured"

      config ->
        %{name: name, destination: Keyword.get(config, :destination)}
    end
  end

  # We don't support any re-configuration so far
  defp configure(_name, _opts, state), do: state
end
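On the UI node, the configured destination MFA is invoked via `:rpc.cast(node, module, function, [crawl_id, message])`, so the target function must accept exactly two arguments. A hypothetical minimal receiver, for illustration only (the real `CrawlyUI.store_log` lives in the crawly_ui repo and persists logs rather than printing them):

```
defmodule CrawlyUI do
  # Hypothetical sketch: called remotely by the worker node as
  # CrawlyUI.store_log(crawl_id, message) via :rpc.cast/4.
  def store_log(crawl_id, message) do
    # The real implementation would persist the line keyed by crawl_id;
    # here we just print it to show the shape of the callback.
    IO.puts("[#{crawl_id}] #{message}")
  end
end
```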
