feat: implement CharRefTokenizer #39

Merged: kkebo merged 6 commits into main from named-char-ref on Apr 6, 2024

@kkebo (Owner) commented Nov 26, 2023

Description

I've implemented CharRefTokenizer.

Benchmark Results

$ swift package --package-path Benchmarks benchmark baseline compare main
warning: 'package-benchmark': Jemalloc disabled through environment variable.
Building for debugging...
[9/9] Emitting module BenchmarkBoilerplateGenerator
Build of product 'BenchmarkBoilerplateGenerator' complete! (1.32s)
Building for debugging...
[1/1] Write swift-version--FC39A71C9A0968F.txt
Build of product 'BenchmarkTool' complete! (2.12s)
Build complete!
Building BenchmarkTool in release mode...
Building benchmark targets in release mode for benchmark run...
Building MyBenchmark

==================
Running Benchmarks
==================

100% [------------------------------------------------------------] ETA: 00:00:00 | MyBenchmark:TokenizerBenchmark

===================================================
Comparing results between 'main' and 'Current_run'
===================================================

Host 'Brown-rhinoceros-beetle' with 8 'aarch64' processors with 7 GB memory, running:
#1 SMP PREEMPT_DYNAMIC Sun Mar 24 19:44:17 UTC 2024

MyBenchmark
============================================================================================================================

----------------------------------------------------------------------------------------------------------------------------
TokenizerBenchmark metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│         Time (wall clock) (μs) *         │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   main                   │      12 │      12 │      12 │      12 │      13 │      20 │      20 │      80 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │      12 │      12 │      12 │      13 │      13 │      15 │      15 │      81 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │       0 │       0 │       0 │       1 │       0 │      -5 │      -5 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │       0 │       0 │       0 │      -8 │       0 │      25 │      25 │       1 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Throughput (# / s) (K)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   main                   │      82 │      82 │      81 │      81 │      79 │      49 │      49 │      80 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │      82 │      81 │      81 │      79 │      79 │      65 │      65 │      81 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │       0 │      -1 │       0 │      -2 │       0 │      16 │      16 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │       0 │      -1 │       0 │      -2 │       0 │      33 │      33 │       1 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

@kkebo marked this pull request as ready for review on April 1, 2024, 17:19
@kkebo self-assigned this on Apr 1, 2024
@kkebo added the enhancement label on Apr 1, 2024

github-actions bot (Contributor) commented Apr 3, 2024

Code Metrics Report

  |           | main (e55b299) | #39 (3a31200) |  +/-  |
  |-----------|----------------|---------------|-------|
  | Coverage  |          88.9% |         88.7% | -0.1% |
Details
  |           | main (e55b299) | #39 (3a31200) |  +/-  |
  |-----------|----------------|---------------|-------|
- | Coverage  |          88.9% |         88.7% | -0.1% |
  |   Files   |              7 |             8 |    +1 |
  |   Lines   |           1202 |          1258 |   +56 |
+ |   Covered |           1068 |          1116 |   +48 |

Code coverage of files in pull request scope (88.6% → 88.4%)

  | Files                                    | Coverage |   +/-   |
  |------------------------------------------|----------|---------|
  | Sources/HTMLEntities/namedChars.swift    |   100.0% | +100.0% |
  | Sources/Tokenizer/CharRefTokenizer.swift |    97.7% |   +0.0% |
  | Sources/Tokenizer/Tokenizer.swift        |    86.3% |   -0.7% |

Reported by octocov

@kkebo (Owner, Author) left a comment

LGTM

@kkebo merged commit f1e72b1 into main on Apr 6, 2024
2 checks passed
@kkebo deleted the named-char-ref branch on April 6, 2024, 12:04