Sea of logs

"Sea of logs" is an interactive tool to visualize LSP traces and other logs.

The target audience for sea-of-logs is software developers who write language-service backends for Visual Studio Code or other editors. You'll often want to look at the trace of messages sent between VSCode and the language service to see what happened, when, and why. VSCode lets you gather traces (open Preferences > Settings and search for 'trace', then view the trace in the Output window and copy/paste/save it to disk). But these traces are so voluminous that you need to explore them: filter out some messages, look at only some parts of the json payload of others, and tie requests together with their responses.

"Sea of logs" offers a way to explore logs which (1) is interactive, (2) lets you use the full expressivity of javascript to filter what you see.

Often you'll have logs from several sources, e.g. the VSCode extension you wrote for your language, the LSP transcript itself, and logs produced by your backend language server. You might want to use cross-log identifiers to tie together, say, a client request with the server's logs about how it handled it.

"Sea of logs" collates multiple logs by timestamp. It uses vertical space to indicate delays, and helps you filter by timerange.

Demo 1: exploring progress messages

  • Try it - demo.html. This exploration is to find out what all the LSP progress messages are about.
  • Technique: use filter: title == '$/progress' to look only at progress messages
  • Technique: use text: json.token + json.value.kind to extract key parts from the json payload
  • Technique: use color: json.token to see which messages are of interest (the sketch below shows how these expressions evaluate)
demo1.mp4
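
The techniques above are just JavaScript expressions evaluated against each parsed message. As a minimal sketch (assumed message shape and made-up values, not the tool's actual code), the Demo 1 expressions behave roughly like this:

// Sketch (assumption): evaluating Demo 1's filter/text/color expressions against
// one made-up $/progress message. The field names (title, json) are the ones documented below.
const message = {
  title: '$/progress',
  json: { token: 'rustAnalyzer/Indexing', value: { kind: 'report' } },
};
const filter = (m) => m.title == '$/progress';              // filter: title == '$/progress'
const text = (m) => m.json.token + m.json.value.kind;       // text: json.token + json.value.kind
const color = (m) => m.json.token;                          // color: json.token
if (filter(message)) console.log(text(message), color(message));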

Demo 2: look at cancellations

  • Try it - demo.html. This exploration is to find out why we're getting failures.
  • Technique: click on a timestamp and "set start" to filter out earlier messages
  • Technique: click on a message to get "details", and use this as a reference while writing your filters.
  • Technique: use text: line to see the most informative first line of messages
  • Technique: use filter: title.includes('/did') to get didOpen and didChange messages
  • Technique: use text: (json?.textDocument ? filename + '#' + json.textDocument.version : '') to get filename and version if present
demo2.mp4

Demo 3: multiple logs

  • Try it - demo.html. This exploration is to see if the VSCode extension had any activity during the LSP message exchange.
  • Technique: use color: log to color by log
  • Technique: use text: log.includes('rust') ? line : title to render messages according to which log they're from
  • Technique: use the Left/Center/Right drop-downs to send one log to the left and the other to the right
demo3.mp4

What kind of logs can be parsed

"Sea of logs" aims to be relaxed about what it can accept, including at least LSP traces. Here are examples of three logs that can all be parsed. Incidentally, all three logs (VSCode extension, LSP trace and backend) come from the same session -- see that id "53906" is shared by different logs.

==> lsp.txt <==
[Trace - 9:43:06 PM] Sending request 'initialize - (53906)'.
Params: {
    "processId": 21953
}
[Trace - 9:43:06 PM] Received response 'initialize - (53906)' in 5ms.
Result: {
    "capabilities": null
}

==> backend.txt <==
[2021-10-26 21:43:06.511] [master][#53906] Heap size: 0.000590G
[2021-10-26 21:43:07.669] [worker-1] Parsing post_ss1.parsing: 0.101390

==> extension.txt <==
[DEBUG][10/26/2021, 9:43:06 PM]: Extension version: 0.2.792
[INFO][10/26/2021, 9:43:06 PM]: Using configuration {cargoRunner: null}

The rules for parsing a logfile into messages (a rough sketch follows the list):

  • A message is defined as one or more lines where the first line starts with one or more tags in square brackets.
  • time is a best effort to parse a timestamp out of those tags.
  • line is what comes after the tags; we make a best effort to split it into title and body.
  • json is a best effort to find a json object or array starting at the end of the first line or on the second line.
  • id is a best effort to extract an id from an LSP trace, or from a tag of the form [#id].
  • filename is a best effort to extract a filename.
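
Here's that rough sketch of the first rule, in JavaScript (an assumption about how the splitting might work, not the actual parser):

// Sketch (assumption, not the tool's code): a new message starts at every line
// that begins with one or more [tag]s; untagged lines continue the previous message.
function splitIntoMessages(logText) {
  const messages = [];
  for (const rawLine of logText.split('\n')) {
    const tagMatch = rawLine.match(/^((?:\[[^\]]*\]\s*)+)/);
    if (tagMatch) {
      const tags = [...tagMatch[1].matchAll(/\[([^\]]*)\]/g)].map((m) => m[1]);
      const line = rawLine.slice(tagMatch[0].length);   // what the rules above call `line`
      messages.push({ tags, line, body: [] });
    } else if (messages.length > 0) {
      messages[messages.length - 1].body.push(rawLine); // continuation lines, e.g. json params
    }
  }
  return messages;
}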

A limitation of LSP traces is that they lack dates; they have only times-of-day. Sea-of-logs will compensate by assuming that the LSP trace starts on the same day as another fully dated log, if present. Another limitation of LSP traces is that they lack milliseconds -- but at least sea-of-logs does parse and respect milliseconds from other logs.
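
For instance, the day could be borrowed like this (a minimal sketch under assumed data shapes, not the actual implementation):

// Sketch (assumption): give time-of-day-only LSP timestamps a full date borrowed
// from the first fully dated message of another log from the same session.
function assignDates(lspMessages, datedMessages) {
  if (datedMessages.length === 0) return;
  const day = datedMessages[0].time;                // e.g. 2021-10-26 21:43:06.511
  for (const m of lspMessages) {
    const t = new Date(day);                        // copy the day...
    t.setHours(m.hours, m.minutes, m.seconds, 0);   // ...and overwrite the time-of-day
    m.time = t;
  }
}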

Parsing is still a work in progress. If you have a reasonable log format that can't be parsed, we should figure out a generalization of your log format and change sea-of-logs to parse it.

How to distribute

"Sea of logs" can be used in four ways. It uses the standard MIT license to encourage redistribution.

Normal. Launch sea-of-logs at https://ljw1004.github.io/seaoflogs/, and click the Load button to explore your logs. As you explore by interactively setting filter and text expressions, those expressions are included in the URL. This way you can bookmark the URL to remember where you left off. NOTE: the URL does not include the content of the logfiles; it only includes their filenames. When you visit your bookmark, you'll have to re-load whatever logfiles you want.

Local install. You can download the "seaoflogs/index.html" file if you want to use the tool locally. It's entirely standalone and doesn't access the network.

Self-contained. You can package up a single self-contained html file that combines both the sea-of-logs tool and one or more logfiles. Indeed, the demos on this page are all self-contained! You might keep that self-contained file on your hard disk, or you might place it on a web page (e.g. linked from an issue-tracker on github). If you share a self-contained file with colleagues, they'll get both the content of the logs and your filters on it. Note that LSP traces usually contain the source code that the user was editing: only share self-contained files if you're allowed to share the user's source code.

# Make a self-contained html file by concatenating the tool and your logs.
# `tail -n +1` prints a "==> filename <==" header before each log, and the sed
# keeps any "-->" in the logs from closing the html comment they're stored in
# (see the threat model below).

$ cp seaoflogs/index.html mylog.html
$ tail -n +1 ~/logs/* | sed 's/-->/-- >/g' >> mylog.html

Self-hosted. If you have a webserver that can serve up dynamic content, you can lock down seaoflogs further. Your webserver will serve up a page like the one below. The idea is that your page embeds seaoflogs in an iframe whose sole permission is sandbox="allow-scripts", i.e. denying it even the ability to make same-origin fetch/XmlHttpRequest API calls, other than the always-allowed requests for its stylesheet and script src. This works because your server can hard-code the initial values for seaoflogs (the initial query params, the log content, and the origin of the containing page), so seaoflogs never needs same-origin access to initialize itself, and can therefore be denied same-origin privileges. When sea-of-logs wants the page's query params to change, it calls postMessage({nonce, params}, target), sending to the seaoflogs_target and with the seaoflogs_nonce that were specified in its meta tags.

<!DOCTYPE html>
<html>
<head>
  <style type="text/css">html, body, iframe {width: 100%; height: 100%; margin: 0; overflow: hidden;}</style>
  <script>window.onmessage = (e) => { if (e.data.nonce == [NONCE]) history.replaceState(null, null, window.location.href.replace(/\?.*$/, '?' + e.data.params)); };</script>
</head>
<body>
  <iframe srcdoc=[ESCAPED_FRAME_CONTENT] sandbox="allow-scripts"></iframe>
</body>
</html>

// frame content
<!DOCTYPE html>
<html>
<head>
  <meta name="seaoflogs_params" content=[INITIAL_PARAMS] />
  <meta name="seaoflogs_target" content=[PAGE_ORIGIN] />
  <meta name="seaoflogs_nonce" content=[NONCE] />
  <meta name="seaoflogs_logs" content=[LOGS] />
  <link rel="stylesheet" href="seaoflogs.css" />
  <script src="seaoflogs.js" />
</head>
<body />
</html>
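
For reference, the sending side inside the frame behaves roughly like this (a sketch based on the description above, not the actual seaoflogs.js):

// Sketch (assumption): how the frame might read its meta tags and notify the
// containing page that the query params should change.
const meta = (name) => document.querySelector(`meta[name="${name}"]`)?.content;
const target = meta('seaoflogs_target');   // the containing page's origin
const nonce = meta('seaoflogs_nonce');     // proves the message comes from this frame
function updateParams(params) {            // e.g. params = 'filter=...&text=...'
  window.parent.postMessage({ nonce, params }, target);
}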

Threat model

Scenario: my customers have sent me their confidential logfiles. I want to be able to analyze them but I have to be sure that I won't leak their data. I'm specifically not willing to upload the logs to some online log-visualizing website. In this case you can download a copy of sea-of-logs index.html to your hard disk, audit it to confirm that it makes no network access, open the local file in your web-browser, and load files into it.

Scenario: my customer sent me a logfile that I don't trust. I want to visualize it but be sure it won't harm me. Sea-of-logs runs in a sandboxed iframe, so even if something malicious was in the logs and sea-of-logs failed to guard against it properly, the damage it can wreak is limited by that sandbox. If you're running sea-of-logs on a local file on your hard disk, Chrome's "same-origin" rules will prevent it from accessing other files. If you're using the public sea-of-logs, the same rules protect you similarly.

Scenario: my customer sent me a logfile that I don't trust. Is it safe to construct and view a self-contained sea-of-logs? In addition to the above, this also raises the concern of whether the self-contained file "breaks out" of just being a log stored inside an html comment. The construction technique sed 's/-->/-- >/g' prevents the log from breaking out of that html comment. Beyond that, the risks are the same as above.

Scenario: someone sent me a malicious sea-of-logs bookmark. Can I click it? Sea-of-logs bookmarks have the form <url>?query=<executable_code>, and the executable code is executed in the sea-of-logs page. Now if their bookmark takes you to an external site then the risk is no different from clicking on any random link to any random site, which we do all the time. If the bookmark directs you to a file:// url on your hard disk, and that file is seaoflogs.html, then the attacker's query string will be executed in the context of a local file on your hard disk. The same "same-origin" rules as above protect you.

Scenario: someone uploaded a malicious self-contained file. Can I click it? If they uploaded a self-contained sea-of-logs then they could have added malicious code, and you'd trust this just the same as trusting any random website. If they ask you to download the file, it's the same as downloading any random html from the internet and opening it locally.

Scenario: I want to host seaoflogs on my website. How can I be sure it won't be a vector for XSS attacks to other parts of my domain? You should use the "self-hosted" model described above. This way, you'll see that it runs entirely in a sandboxed iframe which lacks same-origin privileges to the rest of your domain.

Contributing

No idea. I've never yet built any projects where people were excited enough to contribute. If you're interested, go ahead! Ideas:

  • The code is particularly weak on html-layout at the moment. I tried to do it with flex-box in another branch but it became much slower on 4k+ logfiles. I don't know how to do better.
  • I wonder if the entire message-renderer and svg-renderer could become lazy too, like the background-renderer is at the moment? Then it could handle vastly larger log-files.
  • Are there other formats for LSP-traces that need to be accepted?
  • If you have a logfile that can't be parsed, you could create an issue and paste the log and I'll see how to ingest it. Or contribute ingestion code yourself. The date-parsing code is particularly un-general.
