Improve metrics to be more analytics friendly #125

shah · 2021-07-09T20:17:36Z

@oscarotero and team, I have these suggestions for improving metrics:

Increase granularity
Allow "baggage" (in open telemetry terms) to be contributed by pages

Increasing granularity. Right now we have the following in metrics.json where the "name" has event details that need to parsed in order to be consumable for analytics:

{
    "name": "Load: /support/generate-static-build.sh",
    "entryType": "measure",
    "startTime": 1009.5581999999999,
    "duration": 0.1815000000000282,
    "detail": null
  }

Instead, I suggest something like this:

{
    "name": "Load: /home", // human-friendly, not useful for analytics
    "nature": { category: "load" }, // analytics-friendly, nature can be like a URL with depth /load/X, /preprocess, /render/Z, etc. and can be merged with metrics contributed by pages being rendered
    "target": { url: "/page/url", ...anything else we know about target... }, // analytics-friendly
    "entryType": "measure", // "measure" is the only entry I see here -- is there anything else?
    "startTime": 1007.8624, // analytics-friendly, 
    "duration": 2187.5929, // suggest removing "duration" and only including analytics-friendly "endTime"
    "detail": null // suggest renaming to "baggage" using OpenTelemetry terminology; and allow this to be contributed/merged from generated pages (through filter.metrics({ nature: {  }, baggage: {... anything ...}})
}

Allow metrics "contribution" by pages. Many of my pages run database calls or other expensive calls and I'd like to add my own metrics for those expensive items.

export default async function ( // for Lume
  page: lumeAide.RenderablePage,
  filters: lumeAide.Filters,
) {
  // do something expensive
  filter.metrics({ nature: {  }, baggage: {... anything ...}}); // I can call my own metrics?
}

The text was updated successfully, but these errors were encountered:

oscarotero · 2021-07-10T11:58:40Z

These metrics are generated with the standard Performance web interface, that allows to create marks and measure the time between these marks (or one mark to now) with performance.measure. I think it's better to use an already implemented and available interface instead of reiventing the wheel and create new system.

The performance object is already available in Deno as a global object, so you can use it at any point of your javascript/typescript code. site.metrics is just a wrapper of the performance object to be used internally by lume: See for example: https://github.com/lumeland/lume/blob/master/metrics.ts#L33

So, if you want to add additional metrics from your typescript pages, just need to do something like this:

export default async function (data, filters) {
    performance.mark("SQL query"); //Create a timing mark named "SQL query"
    const result = await makeExpensiveQuery();
    performance.measure("SQL query"); //Measure the timing from the mark "SQL query" to now.
}

Now, the measure of "SQL query" will be included in the list of your metrics. Note that lume only export the entries of type "measure" (not "mark") because marks are used only as the starting point of a measure. See this

Maybe we could include additional info in the details property but, due marks cannot have duplicated names, we have to contain all info in the entry name to avoid duplications.

shah · 2021-07-10T12:02:04Z

Perfect @oscarotero - I completely agree we should stay with Deno's strategy of being as close to web interface as possible. With this new information you've armed me with about metrics I'll do some studying and implement what I need on top. I'm trying to get the metrics to be useful on the server side through Prometheus and OpenTelemetry taxonomy and I'll let you know if I get stuck anywhere. Thanks!

shah · 2021-07-10T12:39:52Z

@oscarotero here's a quick suggestion that might get what we need. In metrics.ts if we change the code to this #getMarkName I think it would work fine:

  /**
   * Generate an unique mark name
   */
  #getMarkName(name: string, subject?: unknown, baggage?: Record<string, unknown>) {
    let markName: string | ({ name: string, subject?: unknown, baggage?: Record<string, unknown>}) = name;
    if(subject) {
        markName = { name, subject }
    }
    if(baggage) {
        markName = { name, subject, baggage }
    }
    return JSON.stringify(name);
  }

This way, if only a name is passed in we get:

{
    "name": "Load",
    "entryType": "measure",
    "startTime": 1009.5581999999999,
    "duration": 0.1815000000000282,
    "detail": null
  }

if other details are passed, we can get a parseable JSON:

{
    "name": "{ name: \"load\", subject: \"/support/generate-static-build.sh\", baggage: "...whatever..." }",
    "entryType": "measure",
    "startTime": 1009.5581999999999,
    "duration": 0.1815000000000282,
    "detail": null
  }

shah · 2021-07-10T12:42:01Z

@oscarotero if you'd rather put the extra information in detail that's fine too -- my only goal is to have is parseable without doing any string manipulation. Thanks!

oscarotero · 2021-07-10T15:29:01Z

I don't think the name is a good place to set info. This is the job of detail property.

shah · 2021-07-10T15:32:05Z

@oscarotero on second thought, I think we should do the following -- this will allow the Metrics to be completely turned into a user-supplied strategy and supplied through _config.ts. We can have a default implementation of the InstumentationSupplier which would be exactly what is available today but would allow those of us who need more sophistication to supply their own InstumentationSupplier implementation without bothering the Lume core.

/**
 * Interface to collect a single event's instrumentation
 */
export interface Instrumentable {
    readonly mark: () => Deno.PerformanceMark;
    readonly measure: () => Deno.PerformanceMeasure;
}

export type InstrumentOptions = Record<string, unknown>;

/**
 * Interface to collect all Lume events' instrumentation
 */
export interface InstumentationSupplier {
  readonly beforeBuild: (options?: InstrumentOptions) => Instrumentable;
  readonly copyAssets: (options?: InstrumentOptions) => Instrumentable;
  readonly copyAsset: (options: { from: string; to: string; }) => Instrumentable;
  readonly loadPages: (options?: InstrumentOptions) => Instrumentable;
  readonly loadPage: (options: { page: Page }) => Instrumentable;
  readonly renderPages: (options?: InstrumentOptions) => Instrumentable;
  readonly renderPage: (options: { page: Page }) => Instrumentable;
  readonly savePages: (options?: InstrumentOptions) => Instrumentable;
  readonly savePage: (options: { page: Page }) => Instrumentable;
  readonly afterBuild: (options?: InstrumentOptions) => Instrumentable;

  // allow unlimited others to be added by site owner?
  // readonly [x: string]: (options?: InstrumentOptions) => Instrumentable;
}

Once the above is available, then the calls would look like this:

   // this.metrics.copyAssets() returns Instrumentable with mark() already called
    const observe = this.metrics.copyAssets(); 
    for (const [from, to] of this.source.staticFiles) {
      await this.#copyStatic(from, to);
    }
    observe.measure();

Inside #copyStatic:

  async #copyStatic(from: string, to: string) {
   // this.metrics.copyAsset() returns Instrumentable with mark() already called
    const observe= this.metrics.copyAsset({ from, to }); // observation is strongly typed
    const pathFrom = this.src(from);
    const pathTo = this.dest(to);

    if (await exists(pathFrom)) {
      await ensureDir(dirname(pathTo));
      if (!this.options.quiet) {
        console.log(`🔥 ${normalizePath(to)} ${gray(from)}`);
      }
      return copy(pathFrom, pathTo, { overwrite: true });
    }
    observe.measure(); // the instrumentation supplier controls all aspects
  }

The current Metrics could become a default implementation like so:

/**
 * Class to collect and return performance metrics
 */
export default class Metrics implements InstumentationSupplier {
  site: Site;

  constructor(site: Site) {
    this.site = site;
  }

  beforeBuild(options?: InstrumentOptions): Instrumentable {
    const result: Instrumentable = {
        mark: () => performance.mark(this.#getMarkName("Build (entire site)"));
        measure: () => {
            const markName = this.#getMarkName("Build (entire site)");
            return performance.measure(markName, markName);
        },
        ...options // everything passed in is included as part of the result
    }
    result.mark();
    return result; // it's the responsibility of the caller to later call result.measure().
  }

  /// all other instrumentables ...

  copyAssets(options?: InstrumentOptions): Instrumentable {
    const result: Instrumentable = {
        mark: () => performance.mark(this.#getMarkName("Copy (all files)"));
        measure: () => {
            const markName = this.#getMarkName("Copy (all files)");
            return performance.measure(markName, markName);
        },
        ...options // everything passed in is included as part of the result
    }
    result.mark();
    return result; // it's the responsibility of the caller to later call result.measure().
  }

  copyAsset: (options: { from: string; to: string; }): Instrumentable {
    const result: Instrumentable = {
        mark: () => performance.mark(this.#getMarkName("Copy", options.from);
        measure: () => {
            const markName = this.#getMarkName("Copy", options.from);
            return performance.measure(markName, markName);
        },
        ...options // everything passed in is included as part of the result (caller can provide anything)
    }
    result.mark();
    return result; // it's the responsibility of the caller to later call result.measure().
  }

oscarotero · 2021-07-10T17:05:11Z

This is too verbose and not so scalable to my taste. It requires to create a method for every thing you want to measure.
But there are good ideas here, so I've modified a bit the metrics class:

instead of the start and end methods, there are only one method: start that returns the callback for the end:

const end = metrics.start("Metric name");
await expensive_operation();
end();

The second argument of start allows to insert any value that will be saved in detail:

const end = metrics.start("Load page", { page });
await loadPage(page);
end();

You can add extra details in the end() function too:

const end = metrics.start("Load page", { page });
const content = await loadPage(page);
end({ content });

The detail object has always the name key with the original name of the metric. This allows to have many metrics with the same name value in detail (but unique entry names):

for (const page as this.pages) {
  const end = metrics.start("Load page", { page });
  await loadPage(page);
  end();
}

// Now, you can get all "Load page" metrics:
const loadPageEntries = metrics.entries.filter((entry) => entry.detail.name === "Load page");

In addition to that, the json file exported is now prettier.

shah · 2021-07-10T17:16:21Z

Sure, if this is the best I can get I'll take it :-).

But, it's not as scalable for complex needs because the Lume core is providing the messages and subjects instead of a type-safe instrumentation supplier doing all messages and categorization. With the design I suggested, tossing in OpenTelemetry as the supplier would be trivial. With your (even much improved) design I am still forced to create an adapter in the middle instead of using a proper type-safe interface. In general I agree with you about reducing verbosity but complex sites like mine that have hundreds of thousands of pages with over a dozen developers need different flexibility in different page generators.

I appreciate you taking the input and improving it though, thanks! I'll use what you've created when the next release comes out and give you more feedback.

shah · 2021-07-10T17:35:39Z

@oscarotero in case you missed it please see this at the bottom of the InstrumentationSupplier interface:

  // allow unlimited others to be added by site owner?
  // readonly [x: string]: (options?: InstrumentOptions) => Instrumentable;

Also, you can make the code less verbose easily with the following -- maintains type-safety with easy flexibility:

export default class Metrics implements InstumentationSupplier {
  site: Site;

  constructor(site: Site) {
    this.site = site;
  }
 
  typicalInstrument(subject: string, options?: InstrumentOptions): Instrumentable {
    const result: Instrumentable = {
        mark: () => performance.mark(this.#getMarkName(subject));
        measure: () => {
            const markName = this.#getMarkName(subject);
            return performance.measure(markName, markName);
        },
        ...options // everything passed in is included as part of the result
    }
    result.mark();
    return result; // it's the responsibility of the caller to later call result.measure().
  }

  beforeBuild: (options?: InstrumentOptions) {
    return this.typicalInstrument("Build (entire site)");
  }

  copyAssets(options?: InstrumentOptions): Instrumentable {
    return this.typicalInstrument("Copy (all files)");
  }

  copyAsset: (options: { from: string; to: string; }): Instrumentable {
    return this.typicalInstrument("Copy (all files)", ...options);
  }

The main thing we want from the core is to have type-safe methods called through a proper interface for all core events. This way, we handle it with compile-time checks and not run-time surprises.

If you agree with proper type-safety in the core you would do this:

    const observe = this.metrics.copyAssets(); // type-safe and trappable by non-core implementation
    for (const [from, to] of this.source.staticFiles) {
      await this.#copyStatic(from, to);
    }
    observe.measure();

If you disagree with proper type-safety in the core you would do this:

    const observe = this.metrics.typicalInstrument("Copy (all files)");  // no limits, but not type-safe, still trappable by non-core implementation
    for (const [from, to] of this.source.staticFiles) {
      await this.#copyStatic(from, to);
    }
    observe.measure();

The nice thing with my suggestion is that it's a proper interface and I can trap any/all calls and do whatever I'd like. Whatever you decide in the end please make it an interface that can be replaced by users' own implementations.

oscarotero · 2021-07-10T17:37:54Z

I think the solution is very similar to your proposal. The only difference is that instead having one method for every thing to measure, we have just one method. I know that you're concerned about typings (so you can accept only some options for every measure, for example require a Page for every renderPage measurement), but in practice, is almost the same. In fact, there are other metrics that you have not taken into account, like processors and preprocessors. And if you want to measure database queries, for example, you should have to create new methods.

This is why I believe this way is more scalable. It works like events: there isn't an event class for every event type, there's just one event with the property type. This allows to have only two methods: addEventListener and dispatchEvent, instead of addBeforeBuildEvent, dispatchBeforeBuildEvent, addAfterBuildEvent, dispatchBeforeBuildEvent, and so on.

shah · 2021-07-10T17:45:40Z

@oscarotero I know it looks similar to my proposal but it's missing the main purpose: flexibility for analytics through multiple implementations. In your version, you're making the decision a priori about what to name a subject (which makes it type-unsafe and defeats typing) because you have concrete classes instead of an interface.

By putting in a minor abstraction (Instrumentable) you get unlimited flexibility with your primary goal of allowing everything complex to be done outside of the simplicity-focused core. Lume's initial TypeScript implementation is fantastic but there are many places there are concrete classes where interfaces should be (like Metrics). This may be the first place where you might take the TypeScript-centric approach to creating small interfaces and default simple implementations in Lume core with non-core implementations can do all the complex work on their own. Similar to your plugins concept.

This is mainly an architecture choice difference -- your style is good, solid, JavaScript concrete classes approach ... my suggestion is to use TypeScript-focused style with interfaces and multiple implementations. If it would be helpful to have a quick voice call to discuss this topic I'm happy to jump on a call. The reason I'm spending time on this discussion is that the architecture of Lume is fantastic but not type-safe yet and it would be great to increase type-safety as you progress to 1.0.

Thanks for hearing me out!

oscarotero · 2021-07-10T18:08:26Z

What you're proposing is that, instead of returning a function, returns an Instrument object with a measure function?
I mean:

// This is your proposal (the not type-safe option)
const observe = this.metrics.typicalInstrument("Copy (all files)");
for (const [from, to] of this.source.staticFiles) {
  await this.#copyStatic(from, to);
}
observe.measure();


// And this is my proposal
const end = this.metrics.start("Copy (all files)");
for (const [from, to] of this.source.staticFiles) {
  await this.#copyStatic(from, to);
}
end();

I see both are almost the same, but maybe I'm missing something. Maybe you're planning to expand the observe object with more functionalities?

My idea to make this extensible is that you could create your own class extending Metrics:

class DbMetrics extends Metrics {
    measureDatabase() {
        return this.start("Database");
    }
}

I guess what you're proposing is creating an interface that you can use without extending the base class, right?

class MyMetrics implements InstumentationSupplier {

Anyway, I'll release a new version in next days, but if you want to test it right now, just run lume upgrade --dev and this will install the last commit version.

btw, feel free to create a pull request if you want to work on this.

shah · 2021-07-10T18:18:38Z

Affirmative @oscarotero -- returning a function is not as extensible as returning an interface (like you said, I'm planning on expanding the observe object) :-).

Also, you're right that I would like to create an interface without extending the base class. Requiring inheritance (class X extends Y) is generally more brittle than using composition (implementing an interface with property functions). With interface implementations the core never worries about any base classes - only default implementations.

As Lume grows and more developers join the core (like I would love to!) the more base classes you have, the more brittle the system will be because base functionality can break in inherited classes. Adding more interfaces with default implementations in favor of base classes with inheritance is ideal - it would allow us to use Inversion of Control (IoC) in _config.ts for all complexity and leave the core super-simple.

oscarotero · 2021-10-08T13:50:02Z

I think this is already implemented, so I'm closing it.

oscarotero added a commit that referenced this issue Jul 10, 2021

improved metrics #125

6ac1564

oscarotero closed this as completed Oct 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve metrics to be more analytics friendly #125

Improve metrics to be more analytics friendly #125

shah commented Jul 9, 2021 •

edited

Loading

oscarotero commented Jul 10, 2021 •

edited

Loading

shah commented Jul 10, 2021

shah commented Jul 10, 2021

shah commented Jul 10, 2021

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021 •

edited

Loading

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021

shah commented Jul 10, 2021

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021 •

edited

Loading

oscarotero commented Jul 10, 2021 •

edited

Loading

shah commented Jul 10, 2021

oscarotero commented Oct 8, 2021

Improve metrics to be more analytics friendly #125

Improve metrics to be more analytics friendly #125

Comments

shah commented Jul 9, 2021 • edited Loading

oscarotero commented Jul 10, 2021 • edited Loading

shah commented Jul 10, 2021

shah commented Jul 10, 2021

shah commented Jul 10, 2021

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021 • edited Loading

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021

shah commented Jul 10, 2021

oscarotero commented Jul 10, 2021

shah commented Jul 10, 2021 • edited Loading

oscarotero commented Jul 10, 2021 • edited Loading

shah commented Jul 10, 2021

oscarotero commented Oct 8, 2021

shah commented Jul 9, 2021 •

edited

Loading

oscarotero commented Jul 10, 2021 •

edited

Loading

shah commented Jul 10, 2021 •

edited

Loading

shah commented Jul 10, 2021 •

edited

Loading

oscarotero commented Jul 10, 2021 •

edited

Loading