Make GCS ftest use shared fixture#1718

Merged
artem-shelkovnikov merged 4 commits into main from artem/make-gcs-use-shared-fixture
Oct 13, 2023

@artem-shelkovnikov artem-shelkovnikov commented Oct 3, 2023

Part of https://github.com/elastic/enterprise-search-team/issues/3397

This PR changes the fixture for the respective content source (GCS) to use the shared test setup built on the WeightedFakeProvider class.

This class generates large amounts of fake data whose sizes follow a given distribution, for example:

# In 65% of cases, generate small files
# In 20% of cases, generate medium-size files
# In 10% of cases, generate large files
# In 5% of cases, generate huge files
fake_provider = WeightedFakeProvider(weights=[0.65, 0.2, 0.1, 0.05])

fake_provider.get_html() # <---- gets HTML of size depending on distribution
fake_provider.get_text() # <---- gets text of size depending on distribution

# Important difference:
# get_text() returns a large amount of text that can be ingested, for example a 2MB payload
# get_html() returns a large payload that contains very little text, for example around 1KB of text in 2MB of HTML,
# to better test download/subextraction of rich content
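
To illustrate the idea, here is a minimal sketch of weighted size selection. It is not the actual WeightedFakeProvider implementation: the class name `WeightedSizeSketch`, the `SIZE_BUCKETS` values, the `seed` parameter and the 1KB cap on visible text are all assumptions made for this example.

```python
import random

# Hypothetical bucket sizes in bytes; the real fixture's sizes are not stated in this PR.
SIZE_BUCKETS = {
    "small": 1024,
    "medium": 64 * 1024,
    "large": 512 * 1024,
    "huge": 2 * 1024 * 1024,
}

# Minimal HTML skeleton: visible text in <p>, bulk hidden in a comment.
HTML_TEMPLATE = "<html><body><p>{text}</p><!--{padding}--></body></html>"


class WeightedSizeSketch:
    """Illustrative stand-in, NOT the real WeightedFakeProvider."""

    def __init__(self, weights, seed=None):
        if len(weights) != len(SIZE_BUCKETS):
            raise ValueError("expected one weight per size bucket")
        self.weights = list(weights)
        self.sizes = list(SIZE_BUCKETS.values())
        # A dedicated, optionally seeded generator keeps runs reproducible.
        self.rng = random.Random(seed)

    def pick_size(self):
        # random.choices draws one bucket size according to the weights.
        return self.rng.choices(self.sizes, weights=self.weights, k=1)[0]

    @staticmethod
    def _filler(length):
        # Repeat a short word and trim to exactly `length` characters.
        word = "lorem "
        return (word * (length // len(word) + 1))[:length]

    def get_text(self):
        # The whole payload is ingestible text.
        return self._filler(self.pick_size())

    def get_html(self):
        # The payload is mostly markup: at most ~1KB of visible text,
        # the rest is padding inside an HTML comment.
        size = self.pick_size()
        text = self._filler(min(1024, size // 4))
        overhead = len(HTML_TEMPLATE.format(text="", padding=""))
        padding = "x" * max(0, size - overhead - len(text))
        return HTML_TEMPLATE.format(text=text, padding=padding)
```

With `weights=[0.65, 0.2, 0.1, 0.05]`, roughly 65% of calls in this sketch return a payload of the smallest bucket size, and `get_html()` never carries more than about 1KB of visible text regardless of the total payload size.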

The end goal of this PR is to make the memory and CPU usage measured in our nightly functional tests more comparable across connectors.

Checklists

Pre-Review Checklist

  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • if there is no GH issue, please create it. Each PR should have a link to an issue
  • this PR has a thorough description
  • Tested the changes locally

Related Pull Requests

@artem-shelkovnikov artem-shelkovnikov enabled auto-merge (squash) October 13, 2023 15:09
@artem-shelkovnikov artem-shelkovnikov merged commit 8032185 into main Oct 13, 2023
@artem-shelkovnikov artem-shelkovnikov deleted the artem/make-gcs-use-shared-fixture branch October 13, 2023 15:24
github-actions Bot pushed a commit that referenced this pull request Oct 13, 2023
@github-actions

💚 Backport PR(s) successfully created

Status Branch Result
8.11 #1793

This backport PR will be merged automatically after passing CI.
