# First Bug

### Finding our bug

We have identified this [bug in dagster](https://github.com/dagster-io/dagster/issues/17707) as a good first issue to tackle.  Don't worry, it's plenty hard.

One of the tricky things is understanding Dagster's architecture.  Doing so helps us identify exactly at what level the bug is occurring.  For example, a basic picture of Dagster's architecture is:

* JavaScript/React frontend making requests to
* Python API (backend)

If we can see the bug occurring in the data the Python API sends back, then we can largely ignore the entire React codebase.  However, if the Python API is sending back the correct data, then this means that the problem is in the React front end.  

So what is the architecture of Dagster?

* React frontend
* Fetching data largely (if not entirely) via GraphQL, using a library called Apollo
* Python backend that serves as a GraphQL API ([relevant code here](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster-graphql/dagster_graphql))
* This connects, via SQLAlchemy, to either a SQLite or Postgres db (some [relevant code here](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster/dagster/_core/storage))

So if we identify that the bug is occurring on the database level (which it likely is), we may also be able to avoid learning GraphQL (but maybe not).

> We can think of GraphQL as an API for the front end to query our Python code.

And so we can then narrow down our problem.

### General Debugging Principle

So a principal part of debugging is determining at what level the bug is occurring: the frontend, the API (GraphQL), or the database (or really the data getting into the database).

And really the first step is reproducing the bug, and seeing it in the UI -- see [this doc](https://docs.google.com/document/d/1BYXKX36rTblWS3eGkferfejLgTs3Wcsw5mqHQhDXYn8/edit) for reproducing our bug.

Then essentially, we can explore the different levels that the bug occurs in.
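
The idea of checking each level in turn can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything from the Dagster codebase: the layer names and the `first_broken_layer` helper are made up for this example.

```python
from typing import Dict, Optional

# Layers ordered from where the data originates to where we finally see it.
# These names are our own labels, not Dagster identifiers.
LAYERS = ["database", "graphql_api", "frontend"]


def first_broken_layer(observations: Dict[str, bool]) -> Optional[str]:
    """Return the layer closest to the data where the expected run is missing.

    `observations` maps a layer name to True if the run is visible at that
    layer and False if it is not.
    """
    for layer in LAYERS:
        if not observations[layer]:
            return layer
    return None  # the run is visible everywhere, so nothing looks broken


# Example: the run is stored, but the API response for the job page omits it.
print(first_broken_layer(
    {"database": True, "graphql_api": False, "frontend": False}
))  # graphql_api
```

Whatever layer comes back is where we would start reading code; everything above it is probably just passing along the missing data.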

### Going from frontend to backend

It's nice to understand how our React front end is using our Python backend.  That way we can (hopefully) see which part of our Python code is relevant.  In other words, we are tracing our bug backwards -- from the frontend (where we see it) down to the level that is leading to it. 

So that we can avoid navigating through a lot of JS, a good technique is to go to the relevant UI, and then look at the Network tab to see what API calls are being made.  
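
If clicking through every request in the Network tab gets tedious, most browsers can export the tab's contents as a HAR file (a JSON document), and we can list the GraphQL operation names from it. The sketch below assumes the GraphQL calls are POST requests to a URL ending in `/graphql`; the sample data and the `localhost:3000` address are hand-written stand-ins, not a real capture.

```python
import json
from typing import List


def graphql_operations(har: dict) -> List[str]:
    """List the operationName of every GraphQL POST found in a HAR export."""
    names = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        # Assumption: GraphQL calls are POSTs to a URL ending in /graphql.
        if request["method"] != "POST" or not request["url"].endswith("/graphql"):
            continue
        body = request.get("postData", {}).get("text", "")
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # not a JSON body, so not a call we care about
        # A body may hold one operation or a list of them; handle both.
        operations = payload if isinstance(payload, list) else [payload]
        for operation in operations:
            if isinstance(operation, dict) and operation.get("operationName"):
                names.append(operation["operationName"])
    return names


# Hand-written stand-in; with a real export, use json.load() on the .har file.
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "http://localhost:3000/runs"}},
            {"request": {
                "method": "POST",
                "url": "http://localhost:3000/graphql",
                "postData": {"text": json.dumps({"operationName": "RunsRootQuery"})},
            }},
            {"request": {
                "method": "POST",
                "url": "http://localhost:3000/graphql",
                "postData": {"text": json.dumps({"operationName": "PipelineRunsRootQuery"})},
            }},
        ]
    }
}

print(graphql_operations(sample_har))  # ['RunsRootQuery', 'PipelineRunsRootQuery']
```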

Our bug relates to Runs, and in reproducing the bug, we have seen that the run *does* show up in the Runs page, but does not show up when we click on a specific job and then from there click on runs.  

> **Note** Just this one observation (without any coding) gives us valuable information.  It means that the run is being stored in the db, but maybe

* it is not being tagged with the right job id, 
* or the query for that job's runs is not being called correctly, 
* or that the query function to the db is not defined properly 
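
To make the first possibility concrete, here is a toy reproduction using an in-memory SQLite database. The `runs` table and its two columns are a simplified, hypothetical schema invented for this sketch; it is not Dagster's real schema, and it does not prove which of the three explanations is the right one.

```python
import sqlite3

# Toy schema: NOT Dagster's real table layout, just enough to show the effect.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id TEXT PRIMARY KEY, pipeline_name TEXT)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [
        ("run-ui", "diamond"),  # run stored with its job name
        ("run-cli", None),      # run stored without a job name
    ],
)

# "Runs page": no filter, so every stored run comes back.
all_runs = [row[0] for row in conn.execute("SELECT run_id FROM runs ORDER BY run_id")]

# "Job page": filter on the job name, so the untagged run silently drops out.
job_runs = [
    row[0]
    for row in conn.execute(
        "SELECT run_id FROM runs WHERE pipeline_name = ? ORDER BY run_id",
        ("diamond",),
    )
]

print(all_runs)  # ['run-cli', 'run-ui']
print(job_runs)  # ['run-ui']
conn.close()
```

The run exists in both cases; it only disappears once a filter is applied, which matches what we see in the UI.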

If it's working on one page but not another, we are likely querying the API differently on one page than another.  So let's see the different query calls by exploring the Network tab.

#### 1. From the Runs page

<img src="./graph-ql.png">

> Above we identify the relevant graphql call by looking at the response and seeing that it aligns with the relevant data in the UI.  

And from there, we can click on Payload to see what query generated this.  So below we see it was the `RunsRootQuery`.

<img src="./payload-request.png">

> Where does the `RunsRootQuery` come from?  We can search for it in the JavaScript UI code, which leads us to [AutomaterializeRunHistoryTable.tsx](https://github.com/dagster-io/dagster/blob/master/js_modules/dagster-ui/packages/ui-core/src/assets/auto-materialization/AutomaterializeRunHistoryTable.tsx).

> Searching the whole codebase for `RunsRootQuery` in VS Code is how we got there.
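
The same search can be done from a script, which is handy when we want to look up several query names at once. This is a minimal sketch; `find_identifier` is a helper written for this example, and the default file suffixes are an assumption about where the queries live.

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def find_identifier(
    root: Path, needle: str, suffixes: Sequence[str] = (".ts", ".tsx")
) -> List[Tuple[str, int]]:
    """Return (relative path, line number) for every line containing `needle`."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path.relative_to(root)), lineno))
    return hits
```

From a local checkout, something like `find_identifier(Path("js_modules/dagster-ui/packages/ui-core/src"), "RunsRootQuery")` should list the files worth opening.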

#### 2. From the Jobs Table

Ok, so is there a different query in the Jobs table that may explain the difference?  
We can check by going to `Overview > Jobs` in the Dagster UI, and then clicking on a particular job.  

From there, we again need to sort through the different GraphQL calls via the Network tab to find the relevant one.  We see that there is a `LatestRunTagQuery` and a `PipelineRunsRootQuery`.  

<img src="./latest-tag-query.png">

> The `PipelineRunsRootQuery` seems the most relevant one.

<img src="./pipeline-runs-root.png">

We can see that it filters for our `pipelineName`.  So again, maybe this query is wrong, or our run could be improperly tagged in the database (i.e., missing the `pipelineName` or `JobName`).
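
One way to test this outside the browser is to send the same kind of query twice, once with the job filter and once without, and compare the run IDs that come back. The sketch below only builds the request body and compares two already-parsed responses. The query shape (`runsOrError`, a `RunsFilter` with `pipelineName`, and `results { runId }`) is our assumption and should be checked against the live schema, and the sample responses are hand-written.

```python
import json
from typing import Optional, Set

# Assumed query shape: verify field and type names against the real schema.
RUNS_QUERY = """
query RunsForDebugging($filter: RunsFilter) {
  runsOrError(filter: $filter) {
    __typename
    ... on Runs {
      results { runId }
    }
  }
}
"""


def build_payload(pipeline_name: Optional[str] = None) -> str:
    """Return the JSON body for a POST to the GraphQL endpoint."""
    run_filter = {"pipelineName": pipeline_name} if pipeline_name else None
    return json.dumps({"query": RUNS_QUERY, "variables": {"filter": run_filter}})


def run_ids(response: dict) -> Set[str]:
    """Extract run IDs from a parsed GraphQL response."""
    results = response["data"]["runsOrError"].get("results", [])
    return {run["runId"] for run in results}


def missing_from_job(all_runs: dict, job_runs: dict) -> Set[str]:
    """Run IDs returned without a filter but absent from the filtered result."""
    return run_ids(all_runs) - run_ids(job_runs)


# Hand-written responses standing in for what the server might return.
unfiltered = {"data": {"runsOrError": {"results": [{"runId": "a"}, {"runId": "b"}]}}}
filtered = {"data": {"runsOrError": {"results": [{"runId": "a"}]}}}

print(missing_from_job(unfiltered, filtered))  # {'b'}
```

Keep in mind that the unfiltered list also contains runs that legitimately belong to other jobs. The run we launched while reproducing the bug is the one to look for in that difference.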

### GraphQL With Python 

> Just casually understand this, as the next section may be more promising.  Maybe spend 30 minutes running through these resources.

Remember that our Python codebase is going to act as the GraphQL server.  And unfortunately, we may not know GraphQL well enough to navigate the codebase and debug.  

You can see there is a [`dagster_graphql` module](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster-graphql) that uses Python, GraphQL and the Graphene library to (likely) make queries to the backend.

You can learn about Graphene [here](https://www.apollographql.com/blog/complete-api-guide), and maybe [here](https://docs.graphene-python.org/en/latest/quickstart/).  And you can learn a bit about GraphQL (with JavaScript) [here](https://www.apollographql.com/docs/kotlin/v2/tutorial/03-write-your-first-query/). 

> From here, it's useful to explore the dagster graphql [module](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster-graphql), maybe the tests as well, and the runs file to get a sense of how this works.  (**Maybe skip this**)

### Storage (May be Key)

While we may not understand GraphQL too well, we are more familiar with SQLAlchemy, and luckily for us, those who [explored this bug](https://github.com/dagster-io/dagster/issues/17707) believe that is where the issue lies.  They identified [Dagster storage](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster/dagster/_core/storage), [specifically here](https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/storage/dagster_run.py#L382-L384), as logic to mimic.

We can see that [Dagster Storage](https://github.com/dagster-io/dagster/tree/master/python_modules/dagster/dagster/_core/storage) does use SQLAlchemy (or at least Alembic, which is closely related).  So perhaps this can allow us to see how jobs are saved in the db, and then show us why they are not being saved through the CLI.

The closer we can get to the problem the better.


* We can learn a little more about storage [here](https://docs.dagster.io/deployment/dagster-instance).
* We still should explore where the [sqlalchemy library](https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/basic_use.html) is potentially used.
* Is there a way to use our SQLAlchemy knowledge to connect to the db?

Then from there, we can perhaps understand (1) how a dagster run is stored in our db and then (2) why a job, and specifically a job tagged to that run is (likely) not being stored in the db.
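
If the local instance uses SQLite, a quick first look at the storage does not even need SQLAlchemy: the standard library's `sqlite3` module can list the tables and columns in a `.db` file. This is a minimal sketch; `describe_sqlite` is a helper written for this example, and where the `.db` file lives depends on how your Dagster instance is configured, so the path is something you will need to look up.

```python
import sqlite3
from typing import Dict, List


def describe_sqlite(db_path: str) -> Dict[str, List[str]]:
    """Map each table in a SQLite file to its column names.

    Note: sqlite3.connect() creates an empty file if db_path does not exist,
    so double-check the path before running this.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the name.
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Once we know the table and column names, we can write ordinary `SELECT` statements (or SQLAlchemy queries) against them to see exactly what was stored for a given run.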

### Re-Explore Suggested Files


* From here, it may be worth going back to the files identified in the bug to better understand how `_core/storage` works and ultimately saves to the database.  
* https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/storage/dagster_run.py#L382-L384 

* It is likely using a relational backend database (SQLite or Postgres), so if we can see that the job is not being stored there, we need to identify a fix.

The main question here is: how does Dagster store information about a run in the database, and is the same thing occurring from the CLI (especially when creating a job)?

And from there, we can try to understand how this relates to the `_cli/job.py` file.
* https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_cli/job.py
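
A practical way to answer that question is to launch the same job once from the UI and once from the CLI, pull both stored records, and compare them field by field. The sketch below shows only the comparison step. The two records and their field names (including `code_location`) are invented for illustration and say nothing about what Dagster actually stores.

```python
from typing import Any, Dict, Tuple


def diff_records(
    first: Dict[str, Any], second: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (first_value, second_value)} for every field that differs.

    A field missing from one record is reported with None on that side.
    """
    differences = {}
    for field in sorted(set(first) | set(second)):
        if first.get(field) != second.get(field):
            differences[field] = (first.get(field), second.get(field))
    return differences


# Hypothetical records: field names and values are made up for this example.
ui_run = {"job_name": "diamond", "status": "SUCCESS", "code_location": "my_location"}
cli_run = {"job_name": "diamond", "status": "SUCCESS"}

print(diff_records(ui_run, cli_run))  # {'code_location': ('my_location', None)}
```

Any field that shows up in the diff is a candidate for what the job page's filter depends on.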

### That's it

Ok, those are the most promising components identified so far.  But if you would like to see things from a JS perspective (as I spent some time doing), there are resources below.

And remember, in identifying our bug (with breakpoints, potential fixes), we'll want to have our Dagster dev environment set up as specified in step 3 [in the google doc](https://docs.google.com/document/d/1BYXKX36rTblWS3eGkferfejLgTs3Wcsw5mqHQhDXYn8/edit).

Also, yes, it likely took a long time to get seemingly nowhere, but you are ramping up on a very complicated codebase, and that takes a bit of time.

### GraphQL playground in JS

If you want more practice with GraphQL, and to get a sense of how it works with the front end, you can move through this tutorial: [ApolloGraphQL](https://www.apollographql.com/docs/kotlin/v2/tutorial/03-write-your-first-query/)

> While moving through the tutorial, one error was that the Apollo server was not being installed per the docs, but we were able to fix this by running 
> `npm install apollo-server`

* Remember that Dagster has a webserver playground that you can interact with.  See [Dagster GraphQL API Docs](https://docs.dagster.io/concepts/webserver/graphql) here.

So if we can reproduce the bug by hitting the GraphQL server directly, we know the problem is in the backend rather than in the React code.  

To see the bug, we can go back to the queries that were being fetched from the client front end:

* `PipelineRunsRootQuery`, `filter: pipelinename: diamond`
* `LatestRunTagQuery`, `runsfilter: pipelinename: diamond`

Ultimately, we can probably find those queries by searching the [UI Core](https://github.com/dagster-io/dagster/tree/master/js_modules/dagster-ui/packages/ui-core) for them, and then reproducing in our playground.   

### JavaScript Frontend Calls (Probably a Rabbit Hole)

Potential Backend Calls

1. `useQuery`
We can also see [via the docs](https://www.apollographql.com/docs/react/data/queries/) that queries are made via the `useQuery` hook, and we can find these calls in the dagster codebase.

But only in two places:
* useAssetGraphData.tsx
* QueryRefresh.tsx

All other places in the codebase likely reference the functions in these files.

2. `client.query`

* Look at the `RunFilterInput.tsx` file, which has calls to `client.query` that also retrieve data.

### Explore how apollo connects to backend

* Even with the exploration above, there is still the question of how Apollo connects to our Python API or a backend database.

* We can see some relevant documentation here with [Apollo Data Sources](https://www.apollographql.com/tutorials/fullstack-quickstart/03-connecting-to-data-sources)

If we look at the `AppProvider.tsx` file, we can see some URLs listed there.

### Explore GraphQL

* The next step is to confirm our bug at the GraphQL level.  

In other words, now that we have seen the bug show up at the UI level, let's perform the same steps and confirm that it shows up at the GraphQL level.  

* If it does, we can likely stop searching through React code and shift our focus to the Python backend.  And when we do, we will already know the related GraphQL call.  

We have a couple of prospects:
 * `PipelineRunsRootQuery`, `filter: pipelinename: diamond`
 * `LatestRunTagQuery`, `runsfilter: pipelinename: diamond`

* [Dagster GraphQL API Docs](https://docs.dagster.io/concepts/webserver/graphql)