Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add capability to generate HTTP requests #25

Merged
merged 6 commits into from
Nov 1, 2022
Merged

Add capability to generate HTTP requests #25

merged 6 commits into from
Nov 1, 2022

Conversation

dilyand
Copy link
Contributor

@dilyand dilyand commented Oct 21, 2022

The purpose of this change is to allow us to generate random http requests that can be used in tests that send data to a live collector.

This is not about generating http requests as part of any specific library, so it isn't producing, eg akka.http.scaladsl.model.HttpRequest, org.http4s.Request or java.net.http.HttpRequest. Rather, it's a wrapper around all the elements required to produce any of the above http request flavours, depending on what technology is being used in the client code base.

Before these changes, the event generator could produce two types of output: collector payloads (thrift records, or what you would find in the good stream of a running collector) and enriched events (tsv or json, like what you would find in the good stream of a running enrich). These two are coupled, in that we can only generate enriched events from collector payloads, but not independently. This coupling is a good idea, because it allows us to write tests like:

val collectorPayloads = gen(collectorPayloads)
val enrichedEvents = gen(enrichedEvents)

enrich(collectorPayloads) mustMatch enrichedEvents

Initially, it seemed like a good idea to extend that coupling, so that collector payloads would in turn be created from http requests; and we could generate matching triple sets.

This turned out to be impractical though.

If the collector receives a request like:

curl -X POST localhost:12345/p1/p2

ie, one missing a body or a querystring, then it still writes out a thrift record into its good stream. We do want to include requests like these in any collector tests. However, we cannot ultimately generate enriched events from them. So if the generation of enriched events is tied to the http requests, we cannot have full matching between the two sets.

Additionally, we want to test the collector with requests to random paths, like p1/p2 in the example above. They would fail to be enriched if the collector payloads generated from these http requests are run through enrich. This creates another disparity.

We can limit the creation of http requests to only such instances that are guaranteed to pass enrichment. This is what we already do with collector payloads. However this seems to put too many restrictions on the sort of http requests we want to generate.

It might still be a good idea, worth revisiting down the line, to chain the three outputs together; but it's out of scope for this PR.

Another limitation that I hit is that it also turned out to be futile to add a toThrift method to the eventgen.tracker.HttpRequest wrapper class. It would have allowed us to run tests like:

val httpRequests = gen(httpRequests)
val collectorPayloads = httpRequests.map(_.toThrift)

httpRequests.foreach(send)

good.getRecords mustMatch collectorPayloads

But we create the collector_timestamp in the collector itself. So any timestamps generated randomly as part of the call to toThrift wouldn't match the actual data in the good stream.

@dilyand
Copy link
Contributor Author

dilyand commented Oct 21, 2022

The failed CI / test check is because we're using JDK 8 in the github action. I'll change it before merging.

Copy link
Contributor

@voropaevp voropaevp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job with http request details generation. Please integarte this into the collector payload.

private def genHead: Gen[Method.Head] = Gen.oneOf(fixedApis, genApi(0), genApi(1)).map(Method.Head)
}

object Headers {
Copy link
Contributor

@voropaevp voropaevp Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates all of the structures from CollectorContext. This is a richer generator and it should be integrated into the CollectorPayload generation. Old stuff could be removed.

This change will make adding an httpSink easier, which would be required later.

CollectorPayload would remain a top level generator, as it could include multiple httpRequests when it is batching. Check the enrich.SdkEvent, it would need some refactoring adding HttpRequestFromColPayload and change to the top-level generator (getPair) to generate 3 entities -

  • collector payload
  • list of enriched events
  • http request(s? I'm no sure if that is one-to-one or one-to-many, depends if collector had caching)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the Headers from the HttpRequest is an easy adjustment. I'll do that.


I believe there's a challenge with the idea of CollectorPayload being the top-level generator, from which both HttpRequest and SdkEvent are generated.

A couple of examples:

1.) We need to be able to generate http requests with either querystring or body, or neither, or both. All of these are valid options for which the collector has defined behaviours that we want to test.

Let's say we make CollectorPayload.payload an Option[List[Body]] and we add CollectorPayload.querystring, which is also optional.

We'll have to be able to generate come CollectorPayloads with both of these set to None, so that we can then derive an http request where both are missing. But then we won't be able to use these same CollectorPayloads to also generate SdkEvent, because they don't contain any actual event data.

2.) We need to be able to generate http requests with random paths, ie Api(random, random).

We can create CollectorPayload with random Api and we can use it to derive http requests. But again, we won't be able to create SdkEvent from it, because there's no enrich adaptor that handles the random/random path.

There are three options we have to solve the above:

  • introduce bad rows, so that we can create failed events from the collector payloads described in the examples;
  • introduce some mapping from 'incomplete' (missing body and qs) / 'faulty' (having random path) collector payload to 'correct' one before it is used to create an SdkEvent
  • introduce some additional config, so we can generate CollectorPayload -> HttpRequest or CollectorPayload -> SdkEvent, but not CollectorPayload -> (HttpRequest, SdkEvent).

I think all three could be valid. But at this time, we don't actually need to make the decision which one is best. Because we already have everything we need to generate data and write tests.


I think also that httpSink is a bit pointless. What would the output look like? The data that the event generator creates is not a genuine http request that can be read and sent with curl or some other tool. It's just a wrapper around some building blocks for real http requests.

Any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Collector payload already handles both GET and POST style requests. It will probably need some extra code to integrate with the new httpRequest, but it shouldn't be much.

    • It makes sense to make HTTP request the bottom-level entity. The reason I suggested the opposite is it would be somewhat less code to refactor. But considering bad events http-request should be at the bottom.

    • Bad events are a separate issue, which probably deserves a separate PR. They could be generated by the mapping functions ColPayloadFromHttpRequest and SdkEventFromColPayload. It would require adding an AST to distinguish event types and refactoring the sinks and a bit of generation logic. To limit the scope sink could just drop the bad events, in the first release.
      It would require a fair amount of config changes.

  2. httpSink would generate genuine HTTP requests. The existing fs2 sink interface will work nicely with http4s client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@voropaevp Sorry, I don't understand. How would that solve the problems I listed in my previous comment about 'empty' requests (with no body and no qs) and about randomised paths?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments ensure that the overall architecture of the application doesn't drift without very good reason.

  • Http client is implemented as Sink
  • Generated events and http requests generated in pairs, so they could be verified.

Both problems you are solving would have to be solved with new code. The difference is integrating it into the existing event generation flow or building a new one.

Copy link
Contributor

@voropaevp voropaevp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The goal of these tests is to ensure that we can't release a new version of the event generator where:
- the default application.conf cannot be parsed because of a typo that corrupts the HOCON;
- the config parsing function accepts corrupted HOCON files.
`change_form` was mistakenly added as a context but it's actually an event.
@dilyand dilyand merged commit 2adcb9c into main Nov 1, 2022
@colmsnowplow colmsnowplow deleted the feature/http branch August 15, 2023 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants