Aggregate metrics from other discovery nodes #1266

Merged (16 commits) on Mar 10, 2021

Conversation

sddioulde
Contributor

Description

We want each discovery node to give us aggregated metrics across all discovery nodes, so that obtaining the metrics does not depend on all nodes being up.

Tests

  • Spun up locally and hit the new endpoints to make sure they return coherent data.
  • Played around with the cron intervals and checked the responses and the logs.
  • Ran the db migration locally to make sure reads and writes are happening.
  • Also working on adding unit tests.

Clients for the metrics data (e.g. the dashboard) will need to update the endpoints they make requests to in order to use the new aggregated metrics.
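
As a rough illustration of the goal stated in the description, here is a minimal sketch of summing total request counts across nodes while tolerating nodes that are down. The function names, node identifiers, and the dict-of-counts shape are assumptions made for this example and are not taken from the PR's code. Only plain totals are summed here; unique-user counts would need different handling.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def aggregate_route_metrics(
    nodes: Iterable[str],
    fetch_metrics: Callable[[str], Dict[str, int]],
) -> Dict[str, int]:
    """Sum per-route request counts across nodes, skipping unreachable ones."""
    totals: Counter = Counter()
    for node in nodes:
        try:
            totals.update(fetch_metrics(node))
        except Exception:
            # A node being down should not block aggregation from the others.
            continue
    return dict(totals)


def fake_fetch(node: str) -> Dict[str, int]:
    # Hypothetical stand-in for an HTTP call to a node's metrics endpoint.
    if node == "node-b":
        raise ConnectionError("node-b is down")
    return {"/v1/tracks": 10, "/v1/users": 5}


aggregated = aggregate_route_metrics(["node-a", "node-b", "node-c"], fake_fetch)
print(aggregated)
```

In this sketch the failing node is simply skipped, so the result reflects only the nodes that answered.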

Member

@raymondjacobson left a comment

Just did a first pass; will review again in more depth on Monday. This is so cool.

discovery-provider/src/api/v1/metrics.py (outdated, resolved)
discovery-provider/src/api/v1/metrics.py (outdated, resolved)
discovery-provider/src/models.py (outdated, resolved)
discovery-provider/src/queries/get_route_metrics.py (outdated, resolved)
discovery-provider/src/utils/redis_metrics.py (outdated, resolved)
discovery-provider/src/tasks/index_metrics.py (outdated, resolved)
discovery-provider/src/tasks/index_metrics.py (resolved)
discovery-provider/src/utils/helpers.py (resolved)
discovery-provider/src/utils/redis_metrics.py (resolved)
@sddioulde force-pushed the saliou-aud-metrics branch 2 times, most recently from 5c7c9a3 to 7a602c8 (March 1, 2021 17:41)
Contributor

@dmanjunath left a comment

Overall structure looks solid. Just had a few questions and comments.

discovery-provider/src/utils/redis_metrics.py (resolved)


def upgrade():
    op.create_table('daily_unique_users_metrics',
Contributor

@dmanjunath, Mar 1, 2021

We should try to distinguish the old per-node tables from the new all-discovery-node tables. If "aggregate" is the wording we're using for all the discovery nodes together, I'd suggest putting that in the table names and related identifiers too; otherwise I'm sure we'll get confused between app_name_metrics and daily_app_name_metrics. Alternatively, we could rename the existing app_name_metrics and route_metrics to app_name_metrics_single_node or something like that.

Contributor Author

100%

start_time = int(start_time_obj.timestamp())
new_route_metrics, new_app_metrics = get_metrics(node, start_time)

logger.info(f"received route metrics: {new_route_metrics}")
Contributor

This will probably be a massive object to print. Do we want to output this?

Contributor Author

good point
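
One possible way to address the concern in this thread is to log a short summary at info level and keep the full payload behind debug level. The sketch below assumes the metrics object is a dict keyed by strings, which the excerpt above does not confirm; `summarize_metrics` and the sample data are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def summarize_metrics(metrics: dict, sample_size: int = 3) -> str:
    """Describe a metrics mapping by size and a few keys instead of dumping it."""
    sample_keys = sorted(metrics)[:sample_size]
    return f"{len(metrics)} entries, sample keys: {sample_keys}"


# Illustrative data only; the real payload shape is an assumption.
new_route_metrics = {"/v1/users": 5, "/v1/tracks": 10}

# Info-level logs stay small regardless of payload size.
logger.info("received route metrics: %s", summarize_metrics(new_route_metrics))

# The full object is only logged when debug logging is enabled.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("full route metrics: %s", new_route_metrics)
```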

num_discovery_providers = sp_factory_inst.functions.getTotalServiceTypeProviders(discovery_node_service_type).call()
logger.info(f"number of discovery providers: {num_discovery_providers}")
service_infos = [
    sp_factory_inst.functions.getServiceEndpointInfo(discovery_node_service_type, i).call()
    for i in range(1, num_discovery_providers + 1)
]
Contributor

Just an optimization, no need to add this, but if performance is or becomes an issue we can use ThreadPoolExecutor to parallelize this, like https://github.com/AudiusProject/audius-protocol/blob/master/discovery-provider/src/tasks/index_network_peers.py#L66-L85

Member

I think doing this sync is probably better

Contributor Author

I like it!
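
The parallel lookup suggested in this thread could look roughly like the following. This is a sketch under stated assumptions: `fetch_info` is a hypothetical stand-in for the per-index contract call shown in the excerpt, and the error handling (skip failed lookups) is an illustrative choice, not the PR's behavior.

```python
import concurrent.futures
from typing import Callable, Iterable, List


def fetch_all(
    indices: Iterable[int],
    fetch_info: Callable[[int], str],
    max_workers: int = 5,
) -> List[str]:
    """Run fetch_info for every index in a thread pool; return results in index order."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_index = {executor.submit(fetch_info, i): i for i in indices}
        for future in concurrent.futures.as_completed(future_to_index):
            index = future_to_index[future]
            try:
                results[index] = future.result()
            except Exception:
                # Skip lookups that fail rather than aborting the whole batch.
                continue
    return [results[i] for i in sorted(results)]


def fake_fetch_info(i: int) -> str:
    # Hypothetical stand-in for a per-provider lookup; index 2 fails on purpose.
    if i == 2:
        raise RuntimeError("lookup failed")
    return f"https://node{i}.example"


endpoints = fetch_all(range(1, 4), fake_fetch_info)
print(endpoints)
```

Threads fit here because each lookup is I/O-bound; results are collected into a dict keyed by index so the output order does not depend on completion order.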

update_app_metrics_count(monthly_app_metrics, historical_metrics['apps']['monthly'])

logger.info("synchronizing historical metrics")
logger.info(f"daily historical route metrics to update: {daily_route_metrics}")
Contributor

Same thing about printing large objects here. Do we want this?

Contributor Author

These shouldn't be super large though, at least for the route metrics. Daily should be about 30 records, each with a unique and total count; monthly would be (12 * num_total_years) records. For apps it's harder to predict the number, because the number of apps used on a given day or month can vary.

But we can surely pull it for now until we see a real need for it.
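
To make the size estimate above concrete, here is a tiny sketch of the arithmetic for route metrics. The function name and the `days_retained` parameter are hypothetical; the 30-day figure and the 12-per-year figure come from the comment.

```python
def expected_route_metric_records(num_total_years: int, days_retained: int = 30) -> dict:
    """Estimate how many daily and monthly route-metric records would be logged."""
    return {
        "daily": days_retained,           # about one record per day
        "monthly": 12 * num_total_years,  # one record per month of history
    }


estimate = expected_route_metric_records(num_total_years=3)
print(estimate)
```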

discovery-provider/src/utils/helpers.py (resolved)
discovery-provider/src/utils/helpers.py (resolved)
@sddioulde force-pushed the saliou-aud-metrics branch 3 times, most recently from 0223483 to 0ae8036 (March 3, 2021 19:58)
Contributor

@dmanjunath left a comment

Great work @sddioulde! I love how all of this is net new too

all_other_nodes = []

# fetch all discovery nodes info in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
Contributor

nice 😃

@raymondjacobson
Member

mergeeeeeeeeeeeeeeeeeeeee!!!!!!

@sddioulde merged commit ceaebcc into master on Mar 10, 2021
@sddioulde deleted the saliou-aud-metrics branch on March 10, 2021 23:24
@sddioulde
Contributor Author

Merged :)
