Analytics #1800
irevoire force-pushed the segment branch 4 times, most recently from 1f858aa to 5688f56 (October 13, 2021 18:56)
irevoire force-pushed the segment branch 3 times, most recently from 8e102dd to 9f419d6 (October 26, 2021 11:17)
MarinPostma suggested changes (Oct 27, 2021)
bors try
try: Build failed
MarinPostma suggested changes (Oct 29, 2021)
MarinPostma force-pushed the segment branch 2 times, most recently from 498ea4b to 519093e (October 29, 2021 15:35)
MarinPostma approved these changes (Oct 29, 2021)
bors merge
Closes #1784
Implements this spec
Anonymous Analytics Policy
1. Functional Specification
I. Summary
This specification gives an exhaustive list of the anonymous metrics collected by the MeiliSearch binary. It also describes the tools we use for this collection and how we identify a MeiliSearch instance.
II. Motivation
At MeiliSearch, our vision is to provide an easy-to-use search solution that meets the essential needs of our users. At all times, we strive to understand our users better and meet their expectations in the best possible way.
Although we can gather needs and understand our users through several channels such as GitHub, Slack, surveys, interviews or roadmap votes, we realize that this is not enough to have a complete view of MeiliSearch usage and feature adoption. By cross-referencing our product discovery phases with aggregated quantitative data, we want to make the product much better than it is today and to base our decisions on how MeiliSearch is actually used, so that we build a product that users love.
III. Explanation
General Data Protection Regulation (GDPR)
The metrics collected are non-sensitive, non-personal and do not identify an individual or a group of individuals using MeiliSearch. The data collected is secured and anonymized. We do not collect any data from the values stored in the documents.
We, the MeiliSearch team, provide an email address so that users can request the removal of their data: privacy@meilisearch.com.
Thanks to the unique identifier generated for their MeiliSearch installation (Instance uuid when launching MeiliSearch), we can remove the corresponding data from all the tools we describe below. Any questions regarding the management of the data collected can be sent to the email address as well.
Tools
Segment
The collected data is sent to Segment. Segment is a platform for data collection and provides data management tools.
Amplitude
Amplitude is a tool for graphing and highlighting collected data. Segment feeds Amplitude so that we can build visualizations according to our needs.
The identify call we send every hour:
System Configuration
system
This property allows us to gather essential information to better understand which type of machine MeiliSearch runs on. This allows us to better advise users on the machines to choose according to their data volume and their use cases.
system => Never changes but is still sent every hour
  […] Kb, eg: 33604210
  […] Kb, eg: 336042103
  […] MEILI_SERVER_PROVIDER env var. This is also filled by our providers' deploy scripts, e.g. GCP cloud-config.yaml, eg: gcp
MeiliSearch Configuration
context.app.version: MeiliSearch version, eg: 0.23.0
env: production/development, eg: production
has_snapshot: Does the MeiliSearch instance have snapshots activated, eg: true
MeiliSearch Statistics
stats
  database_size: Size of indexed data. Expressed in Kb, eg: 180230
  indexes_number: Number of indexes, eg: 2
  documents_number: Number of indexed documents, eg: 165847
  start_since_days: How many days ago was the instance launched?, eg: 328
Documents Searched POST
The Documents Searched event is sent once an hour. Its properties are aggregated over all search operations received during that hour, so that we do not track every request and generate unnecessary noise.
user-agent: Represents all the user-agents encountered on this endpoint during one hour, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
requests
  99th_response_time: The maximum latency, in ms, for the fastest 99% of requests, eg: 57ms
  total_suceeded: The total number of succeeded search requests, eg: 3456
  total_failed: The total number of failed search requests, eg: 24
  total_received: The total number of received search requests, eg: 3480
sort
  with_geoPoint: Has the built-in sort rule _geoPoint been used?, eg: true/false
  avg_criteria_number: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] counts as 0, while a request that does not send sort does not influence the average, eg: 2
filter
  with_geoRadius: Has the built-in filter rule _geoRadius been used?, eg: true/false
  avg_criteria_number: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] counts as 0, while a request that does not send filter does not influence the average, eg: 4
  most_used_syntax: The most used filter syntax among all the requests containing the filter parameter: string/array/mixed, eg: mixed
q
  avg_terms_number: The average number of terms for the q parameter among all requests, eg: 5
pagination
  max_limit: The maximum limit encountered among all requests, eg: 20
  max_offset: The maximum offset encountered among all requests, eg: 1000
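The hourly aggregation described for this event could be sketched as follows. This is a minimal illustration in Python, not the code of this PR: the class `SearchAggregate`, its method names, the default `limit`/`offset` values and the nearest-rank method for the 99th percentile are all assumptions, and only a subset of the properties (user-agent, requests, sort.avg_criteria_number, pagination) is covered.

```python
from dataclasses import dataclass, field


@dataclass
class SearchAggregate:
    """Hypothetical in-memory accumulator for one hour of search requests."""

    user_agents: set = field(default_factory=set)
    response_times_ms: list = field(default_factory=list)
    total_succeeded: int = 0
    total_failed: int = 0
    sort_criteria_counts: list = field(default_factory=list)
    max_limit: int = 0
    max_offset: int = 0

    def record(self, user_agent, succeeded, time_ms, sort=None, limit=20, offset=0):
        """Fold one search request into the aggregate (defaults are illustrative)."""
        self.user_agents.add(user_agent)
        self.response_times_ms.append(time_ms)
        if succeeded:
            self.total_succeeded += 1
        else:
            self.total_failed += 1
        # sort=None means the parameter was not sent, so it is ignored;
        # sort=[] counts as 0 criteria, as the property description states.
        if sort is not None:
            self.sort_criteria_counts.append(len(sort))
        self.max_limit = max(self.max_limit, limit)
        self.max_offset = max(self.max_offset, offset)

    def to_event(self):
        """Build the properties of the hourly event from what was recorded."""
        times = sorted(self.response_times_ms)
        n = len(times)
        # Nearest-rank 99th percentile using integer arithmetic:
        # rank = ceil(0.99 * n). The method is an assumption.
        p99 = times[(99 * n + 99) // 100 - 1] if n else None
        counts = self.sort_criteria_counts
        return {
            "user-agent": sorted(self.user_agents),
            "requests": {
                "99th_response_time": p99,
                # Key spelled as in the property list above.
                "total_suceeded": self.total_succeeded,
                "total_failed": self.total_failed,
                "total_received": self.total_succeeded + self.total_failed,
            },
            "sort": {
                "avg_criteria_number": sum(counts) / len(counts) if counts else None,
            },
            "pagination": {
                "max_limit": self.max_limit,
                "max_offset": self.max_offset,
            },
        }
```

A caller would invoke `record` once per request and `to_event` once an hour, then start again with a fresh `SearchAggregate`. Keeping only counters, maxima and a list of latencies is one way to avoid storing anything about individual queries.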
Documents Searched GET
The Documents Searched event is sent once an hour. Its properties are aggregated over all search operations received during that hour, so that we do not track every request and generate unnecessary noise.
user-agent: Represents all the user-agents encountered on this endpoint during one hour, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
requests
  99th_response_time: The maximum latency, in ms, for the fastest 99% of requests, eg: 57ms
  total_suceeded: The total number of succeeded search requests, eg: 3456
  total_failed: The total number of failed search requests, eg: 24
  total_received: The total number of received search requests, eg: 3480
sort
  with_geoPoint: Has the built-in sort rule _geoPoint been used?, eg: true/false
  avg_criteria_number: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] counts as 0, while a request that does not send sort does not influence the average, eg: 2
filter
  with_geoRadius: Has the built-in filter rule _geoRadius been used?, eg: true/false
  avg_criteria_number: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] counts as 0, while a request that does not send filter does not influence the average, eg: 4
  most_used_syntax: The most used filter syntax among all the requests containing the filter parameter: string/array/mixed, eg: mixed
q
  avg_terms_number: The average number of terms for the q parameter among all requests, eg: 5
pagination
  max_limit: The maximum limit encountered among all requests, eg: 20
  max_offset: The maximum offset encountered among all requests, eg: 1000
Index Created
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
primary_key: The name of the field used as primary key if set, otherwise null, eg: id
Index Updated
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
primary_key: The name of the field used as primary key if set, otherwise null, eg: id
Documents Added
The Documents Added event is sent once an hour. Its properties are aggregated over all POST /documents addition operations during that hour, so that we do not track every request and generate unnecessary noise.
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
payload_type: Represents all the payload_type values encountered on this endpoint during one hour, eg: [text/csv]
primary_key: The name of the field used as primary key if set, otherwise null, eg: id
index_creation: Did an index creation happen, eg: false
Documents Updated
The Documents Updated event is sent once an hour. Its properties are aggregated over all PUT /documents operations during that hour, so that we do not track every request and generate unnecessary noise.
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
payload_type: Represents all the payload_type values encountered on this endpoint during one hour, eg: [application/json]
primary_key: The name of the field used as primary key if set, otherwise null, eg: id
index_creation: Did an index creation happen, eg: false

user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
ranking_rules
  sort_position: Position of the sort ranking rule if any, otherwise null, eg: 5
sortable_attributes
  total: Number of sortable attributes, eg: 3
  has_geo: Indicate if _geo is set as a sortable attribute, eg: false
filterable_attributes
  total: Number of filterable attributes, eg: 3
  has_geo: Indicate if _geo is set as a filterable attribute, eg: false
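As an illustration of how the ranking_rules, sortable_attributes and filterable_attributes properties listed above could be derived from a settings update payload, here is a minimal Python sketch. The function name `settings_event_properties` is hypothetical, and the sketch assumes that sort_position is a 0-based index into the rankingRules array, which the property description does not specify.

```python
def settings_event_properties(settings):
    """Derive analytics properties from a settings payload (hypothetical helper).

    `settings` is a dict using the settings API field names
    rankingRules, sortableAttributes and filterableAttributes.
    Only counts, positions and booleans are kept, never attribute names.
    """
    ranking_rules = settings.get("rankingRules") or []
    sortable = settings.get("sortableAttributes") or []
    filterable = settings.get("filterableAttributes") or []

    # Assumption: the position is the 0-based index of the "sort" rule.
    sort_position = ranking_rules.index("sort") if "sort" in ranking_rules else None

    return {
        "ranking_rules": {"sort_position": sort_position},
        "sortable_attributes": {
            "total": len(sortable),
            "has_geo": "_geo" in sortable,
        },
        "filterable_attributes": {
            "total": len(filterable),
            "has_geo": "_geo" in filterable,
        },
    }
```

Reducing the payload to totals and flags in this way is consistent with the stated goal of not collecting any value stored by the user.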
RankingRules Updated
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
sort_position: Position of the sort ranking rule if any, otherwise null, eg: 5
SortableAttributes Updated
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
total: Number of sortable attributes, eg: 3
has_geo: Indicate if _geo is set as a sortable attribute, eg: false
FilterableAttributes Updated
user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
total: Number of filterable attributes, eg: 3
has_geo: Indicate if _geo is set as a filterable attribute, eg: false

user-agent: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
Ensure the user-id file is well saved and loaded with:
  the dumps
  the snapshots
Ensure the CLI uuid is only shown if analytics are activated at launch or if the uuid already exists (= even if meilisearch was launched without analytics)
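The first checklist item (the user-id file must survive being saved and loaded) relies on a load-or-create behaviour that could look like the sketch below. It is written in Python for illustration only; the function name `load_or_create_instance_uid`, the file layout (one UUID as text) and the choice to regenerate the identifier when the file is unreadable are assumptions, not what this PR implements.

```python
import uuid
from pathlib import Path


def load_or_create_instance_uid(path):
    """Return the instance uuid stored at `path`, creating it if needed.

    Hypothetical helper: the file is assumed to contain a single UUID
    written as text.
    """
    path = Path(path)
    if path.exists():
        try:
            # Reuse the stored identifier so the instance keeps the same
            # identity across restarts, dumps and snapshots.
            return uuid.UUID(path.read_text().strip())
        except ValueError:
            # Unreadable content: fall through and generate a new uuid
            # (an assumption about the desired behaviour).
            pass
    uid = uuid.uuid4()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(str(uid))
    return uid
```

Calling the function twice with the same path returns the same identifier, which is the property the dump and snapshot checks above are meant to preserve.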