Update "Loading sample data" documentation #22446

Closed
alexfrancoeur opened this issue Aug 28, 2018 · 13 comments
Labels
Team:Docs, Team:Visualizations (Visualization editors, elastic-charts and infrastructure), v7.6.0

Comments

@alexfrancoeur

Currently, we are using Logstash for the getting started documentation in Kibana, https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html.

Since the majority of the Add Data tutorials use Beats, it probably makes more sense to use Filebeat instead of Logstash. I wonder if we should update the documentation to use a specific tutorial rather than manually configuring Logstash mappings? For instance, system logs will always return data.

@gchaps let's sync live or discuss async in this issue. How can we improve this documentation / experience and highlight filebeat?

@alexfrancoeur
Author

Notes from Sep 5

  • Update the data set to be more recent
  • Change logstash- to filebeat-
  • Discussed replacing sample data sets in advanced tutorial with add data tutorials

@dedemorton
Contributor

dedemorton commented Sep 7, 2018

I wouldn't change the index name to filebeat- unless you are planning to use Filebeat to ingest the example file. Since the file is already in JSON format, I don't see an advantage to using Filebeat. In fact, you'll make the steps more complicated. For now, I'd recommend updating the data set to make it more current and using logs- for the index prefix so that we avoid mentioning logstash.
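To make the rename concrete, here is a minimal sketch. It assumes the sample file is in bulk API format (one `index` action line followed by one source line, repeated) and that the index name sits in `_index` on the action line. The function name and the default prefixes are illustrative only, not something the tutorial ships.

```python
import json


def rename_bulk_index_prefix(lines, old_prefix="logstash-", new_prefix="logs-"):
    """Yield bulk-file lines with the index prefix rewritten on action lines.

    Assumption: every action is an `index` action followed by exactly one
    source line, so non-empty lines alternate action, source, action, source.
    """
    non_empty = (ln.strip() for ln in lines if ln.strip())
    for i, line in enumerate(non_empty):
        if i % 2 == 0:
            # Action line, e.g. {"index": {"_index": "logstash-2015.05.18"}}
            action = json.loads(line)
            meta = action["index"]
            if meta["_index"].startswith(old_prefix):
                meta["_index"] = new_prefix + meta["_index"][len(old_prefix):]
            yield json.dumps(action)
        else:
            # Source document: passed through unchanged.
            yield line
```

The source documents are never parsed, so only the index names change and the existing visualizations would only need their index pattern updated.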

If you want to show a tutorial that uses Filebeat, it would make sense to feed Filebeat an unstructured log format, but that's going to require more configuration and obscure the goal of the tutorial.

Another thought: the problem with using the Add Data tutorial is that the data you capture might not be robust enough to support the visualizations you want to show. For example, when I use the Filebeat system module out-of-the-box on MacOS, my system logs do not contain any IP information, so there's no GeoIP data sent to Elasticsearch. Without GeoIP data, I can't build Coordinate Maps.
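One way to check up front whether captured data can support a map is to measure how many documents carry a geo field at all. The sketch below is only an illustration: the field path `geoip.location` and both helper names are assumptions, and the real field name depends on how the data was ingested.

```python
def has_field(doc, dotted_path):
    """Return True if the nested field named by dotted_path exists and is not None."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return current is not None


def geo_coverage(docs, dotted_path="geoip.location"):
    """Return the fraction of documents that contain the (assumed) geo field."""
    docs = list(docs)
    if not docs:
        return 0.0
    return sum(has_field(d, dotted_path) for d in docs) / len(docs)
```

A coverage of zero on a sample of system-module events would indicate that the data cannot drive Coordinate Maps.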

@alexfrancoeur
Author

Thanks for your feedback and input @dedemorton! I added some comments / questions inline.

> I wouldn't change the index name to filebeat- unless you are planning to use Filebeat to ingest the example file.

At the moment we're not using Logstash to ingest the data but the index is still logstash-. Wouldn't it be the same for Filebeat / filebeat-? I'd also be fine with logs-.

> If you want to show a tutorial that uses Filebeat, it would make sense to feed Filebeat an unstructured log format, but that's going to require more configuration and obscure the goal of the tutorial.

I don't necessarily need to use Filebeat for a tutorial, but thought it would be a good idea to take advantage of the Add Data tutorials in the UI to start ingesting real-time data. I was toying around with this idea and love the feedback. My thought was that we'd use System metrics: they are usually pretty noisy, and you can install directly on your local machine to begin getting data.

[Screenshot, 2018-09-11 9:41 AM]

Though we'd need a disclaimer on how to stop it if you want to shut it off. That would allow us to combine the two tutorials (possibly) to showcase Sample Data with Metricbeat / Filebeat.

> Another thought: the problem with using the Add Data tutorial is that the data you capture might not be robust enough to support the visualizations you want to show. For example, when I use the Filebeat system module out-of-the-box on MacOS, my system logs do not contain any IP information, so there's no GeoIP data sent to Elasticsearch. Without GeoIP data, I can't build Coordinate Maps.

This is true; even with System metrics we likely won't get GeoIP data. That said, our new web logs sample data (#22276), coming in 6.5, may help with this specific use case. We'll also be adding an eCommerce data set.

I'm struggling with what to do here. On one hand, I'm not sure how important it is to have a full ingest step in the Getting Started with Kibana documentation. We could probably use sample data for a majority of what we need to showcase. On the other hand, it's really the only way to show the process of creating an index pattern. Not that using a Beat tutorial will help here either, but it does surface the ease of ingesting data in a turn-key fashion.

@dedemorton
Contributor

@alexfrancoeur I agree that loading data through the bulk API is a bit contrived. It would be fine to show the Add Data functionality, assuming your sample data is robust enough to support all the visualizations that you want to show (you won't get all the data you need simply by running the system module). Of course, using a new sample set means that you'll need to redo all the visualizations, which will be significantly more work than simply renaming the index and updating the existing sample data.

@debadair
Contributor

@gchaps and I talked about this recently. My take is that the sample data sets (and really the entire tutorial) are obsolete. When the tutorial was created, there was no end-to-end stack tutorial and no built-in way to ingest data. The bulk API was the most expedient way to get some canned data into Elasticsearch so we could show how Kibana works.

I don't think we can just stick a bandaid on what's there--it's not simply a matter of getting rid of the Logstash reference & tweaking the sample data. To me, the primary goal of the Kibana tutorial is to give a tour of Kibana and walk users through how to interact with data in an Elasticsearch index. If we can accomplish that with the built-in sample data/add data functionality, we should. Or, perhaps, decide that we have different goals for the doc tutorial given the current reality.

Whatever we decide to do, it should fit into our overall Getting Started story.

@dedemorton
Contributor

@debadair Agreed. I was misunderstanding the goal of that section. I thought it was meant to teach people how to build dashboards. I agree that a full walkthrough needs to present a more comprehensive experience. The system module might be sufficient for a tutorial like that (depending on the types of visualizations you want to showcase). Again...it's back to the question of whether the data supports the story that you want to tell. @gchaps Let me know if you need help with the ingest parts.

@alexfrancoeur
Author

With Sample Data, we have an interactive way to work with pre-built dashboards. With that sample data set, we can also build our own. @AlonaNadler just re-recorded the Kibana Getting Started webinar (https://www.elastic.co/webinars/getting-started-kibana), maybe we can align here? It's a quick 30 min flow that touches upon Cloud and uses sample data not only to describe aspects of Kibana but also to build your own Dashboard. I agree that the getting started with Kibana tutorial should be focused on Kibana, not the ingestion portion. That being said, we need a way to tie in index patterns. They are a core part of the getting started experience if you are not using the add data tutorials.

@gchaps
Contributor

gchaps commented Sep 14, 2018

@debadair and I spoke about how to revise the Getting Started. Our thoughts are that it should provide a pointer to Alona's Getting Started webinar, plus these three tutorials:

  • Explore Kibana using the Flights dashboard. This is the tutorial that was added for 6.4.

  • Use an Add Data tutorial to ingest your data. This is a new tutorial that will cover:
    -- Adding data
    -- Creating visualizations
    -- Creating and sharing a dashboard
    -- Creating a report

  • Starting to visualize your data. This is also a new tutorial, but it will not be based on a particular data set. It will address @alexfrancoeur's concern about including index patterns in the GS experience. Here is the outline:
    -- Finding indices
    -- Setting up index patterns to match your indices
    -- Selecting index patterns
    -- Managing index patterns
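
As a rough illustration of what "setting up index patterns to match your indices" means, here is a sketch that approximates wildcard matching with Python's `fnmatch`. This is not how Kibana resolves patterns internally; the helper name and the index names are made up for the example.

```python
from fnmatch import fnmatchcase


def matching_indices(pattern, index_names):
    """Return the index names matched by a wildcard pattern such as 'logs-*'.

    Approximation only: fnmatch also gives special meaning to '?' and '[...]',
    which an index pattern may not.
    """
    return [name for name in index_names if fnmatchcase(name, pattern)]


# Hypothetical index names, for illustration only.
indices = ["logs-2018.09.05", "logstash-2015.05.18", "filebeat-6.4.0-2018.09.05"]
print(matching_indices("logs-*", indices))
```

With these names, `logs-*` selects only the first index, which shows why renaming the index prefix also means updating the pattern.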

Alona's webinar also includes walkthroughs for watcher and security. We could also add separate tutorials for those concepts.

@alexfrancoeur Let me know what you think. If we go this route, we discussed earlier using the System log for the Add Data tutorial. Do you have a recommendation for what visualizations to create?

@alexfrancoeur
Author

@gchaps I like this direction and can certainly help with some visualizations. System metrics or System logs might both be good examples, both of which have dashboards that are prepackaged out of the box. Metrics may have more "exciting" visualizations. It'd be very easy to reproduce some of the visualizations from that dashboard. I think for 6.5, we'll also need to consider Kibana Spaces impacting the getting started experience. By default, there will only be one, so we may not need to address it but thought I'd point it out.

@rashmivkulkarni
Contributor

@gchaps - can you take a look at the comments and close it out/address it as you see fit? Thanks

@LeeDr added the Team:Visualizations (Visualization editors, elastic-charts and infrastructure) label on Oct 10, 2018

@schersh
Contributor

schersh commented Mar 18, 2019

Update:

  • Create new getting started tutorial using system metrics module
  • Use the system data tutorial in Kibana to show users how to ingest data using Metricbeat
  • Showcase prepackaged dashboards to show interactions with visualizations (TSVB)
  • Make connections to other solutions like Infrastructure or Logs

@gchaps
Contributor

gchaps commented Dec 6, 2019

Notes from Dec 9 meeting re: Getting Started update.

Revised GS outline

  1. Set context
  2. Use File upload to get data into Kibana
  3. Ask a specific question
  4. Discover data
    • Discuss different ways to search, but focus on KQL
    • Show how to add and remove fields
    • Show time filter
    • Show how to filter results
  5. Create 2 to 3 visualizations
  6. Create a dashboard
  7. What to do next--give users help on when to use which visualization:
    • Time series data--walkthrough of TSVB
    • Infographic or more polished presentation--walkthrough of Canvas
    • Geo spatial data--walkthrough of Maps

@KOTungseth
Contributor

We are going in a different direction for the Getting started section, so we are closing this issue. If you want to discuss this issue, please feel free to reopen.

8 participants