Update schemas to the latest format #1010

Open
vchrombie opened this issue Oct 5, 2021 · 2 comments
Labels
good first issue Good issue for first-time contributors hacktoberfest

Comments

@vchrombie
Member

ELK keeps a description of each enriched index used to build the Kibiter dashboards. These descriptions are stored as CSV files in the schema folder. Over time the format has evolved: the current one lists, for each attribute, its name, its type, whether the field can be aggregated, and a description (e.g., schema/git.csv). However, some schemas are still not aligned with the latest format. For instance, this is the case for:

The goal of this issue is to update the schemas to the latest format. To do so, for a given data source (e.g., meetup, stackoverflow), run micro-mordred[*] to collect and enrich the data. Then inspect the enriched documents using the Dev Tools or the Discover view of Kibiter. For each attribute found in the enriched index, the corresponding schema should contain the name of the attribute, its type, whether the field can be aggregated, and a description.
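As a rough illustration of the check described above, the sketch below compares the attributes of one enriched document against a schema file and reports what is missing. The function name `check_schema` and the header `name,type,aggregatable,description` are assumptions made for this example; verify the header against an up-to-date file such as schema/git.csv before relying on it.

```python
import csv
import io

# Assumed column names; the real header in schema/*.csv may differ.
COLUMNS = ("name", "type", "aggregatable", "description")


def check_schema(schema_csv_text, enriched_doc):
    """Return (missing, incomplete) for one enriched document.

    missing:    attributes of the document that have no row in the schema.
    incomplete: schema rows where at least one of the four columns is empty.
    """
    rows = list(csv.DictReader(io.StringIO(schema_csv_text)))
    described = {row["name"] for row in rows}
    missing = sorted(set(enriched_doc) - described)
    incomplete = sorted(
        row["name"]
        for row in rows
        if any(not (row.get(col) or "").strip() for col in COLUMNS)
    )
    return missing, incomplete


if __name__ == "__main__":
    # Hypothetical schema content and enriched document, for illustration only.
    schema = (
        "name,type,aggregatable,description\n"
        "author_name,keyword,true,Author name.\n"
        "title,text,false,\n"
    )
    doc = {"author_name": "Jane", "title": "Hello", "origin": "https://example.org"}
    print(check_schema(schema, doc))
```

A single document may not contain every attribute of the index, so it is worth running such a check over several documents.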

You can also use this script to automate the process and create the schema file from the index: generate-es-index-schema.py
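To give an idea of what generating a schema from an index involves, here is a minimal sketch that turns an Elasticsearch mapping (the `properties` dictionary returned by a `GET <index>/_mapping` request) into schema rows. It is not the script linked above. The helper names, the column header, the `true`/`false` values, and the rule that only `text` fields are treated as non-aggregatable are all assumptions of this example.

```python
import csv
import io

# Assumed header of the latest schema format; check it against schema/git.csv.
FIELDNAMES = ["name", "type", "aggregatable", "description"]

# Simplifying assumption: "text" fields cannot be aggregated by default,
# every other field type is treated as aggregatable.
NOT_AGGREGATABLE = {"text"}


def mapping_to_rows(properties, prefix=""):
    """Flatten a mapping 'properties' dict into a list of schema rows."""
    rows = []
    for field, spec in sorted(properties.items()):
        name = prefix + field
        if "properties" in spec:
            # Object field: recurse and use dotted names for the sub-fields.
            rows.extend(mapping_to_rows(spec["properties"], name + "."))
            continue
        field_type = spec.get("type", "object")
        rows.append({
            "name": name,
            "type": field_type,
            "aggregatable": str(field_type not in NOT_AGGREGATABLE).lower(),
            "description": "",  # descriptions still have to be written by hand
        })
    return rows


def rows_to_csv(rows):
    """Serialize schema rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical mapping fragment, for illustration only.
    mapping = {
        "author_name": {"type": "keyword"},
        "message": {"type": "text"},
        "data": {"properties": {"id": {"type": "long"}}},
    }
    print(rows_to_csv(mapping_to_rows(mapping)))
```

The description column is left empty on purpose: the mapping carries names and types, but the meaning of each field has to come from someone who knows the data source.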

Note that some fields, such as grimoire_creation_date, project, project_1 and origin, are shared across all enriched indexes, and their descriptions can be taken from existing schemas.
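Reusing descriptions of shared fields can be sketched as a simple lookup: rows without a description take the one from an existing, already updated schema when the field name matches. The function name `fill_shared_descriptions` and the row keys are assumptions of this example, and the sample descriptions are placeholders rather than text from the real schemas.

```python
def fill_shared_descriptions(new_rows, reference_rows):
    """Copy descriptions from reference_rows into new_rows where missing.

    Both arguments are lists of dicts with at least the keys
    'name' and 'description' (e.g. as produced by csv.DictReader).
    Rows that already have a description are left untouched.
    """
    known = {
        row["name"]: row["description"]
        for row in reference_rows
        if row["description"]
    }
    filled = []
    for row in new_rows:
        row = dict(row)  # work on a copy so the input is not modified
        if not row["description"] and row["name"] in known:
            row["description"] = known[row["name"]]
        filled.append(row)
    return filled


if __name__ == "__main__":
    # Placeholder rows, for illustration only.
    reference = [
        {"name": "origin", "description": "Placeholder description of origin."},
        {"name": "project", "description": "Placeholder description of project."},
    ]
    new = [
        {"name": "origin", "description": ""},
        {"name": "meetup_created", "description": ""},
    ]
    for row in fill_shared_descriptions(new, reference):
        print(row)
```

Fields specific to the data source keep an empty description and still need to be documented manually.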

[*] Details on how to run micro-mordred for a given data source are available at: supported-data-sources.

Related issues

@vchrombie added the "good first issue" and "hacktoberfest" labels on Oct 5, 2021
@prokan468

I have worked with CSV files and python. Please do assign this issue to me and I shall provide you with the results.

@vchrombie
Member Author

> I have worked with CSV files and python. Please do assign this issue to me and I shall provide you with the results.

Hi @prokan468, thanks for showing interest. We cannot assign this issue since it is a long one. Feel free to choose a backend, follow the steps, update the schema, and open a PR.

Please let me know if you need any help.
