This repository was archived by the owner on Sep 30, 2024. It is now read-only.
Closed
119 commits
fa89122
migrations: add codeinsights migrations (based on migrations/codeintel)
Jan 13, 2021
6014933
internal/db: generate separate schema.md files for codeintel and fron…
Jan 13, 2021
3d7323e
internal/db/dbutil: add codeinsights migrations
Jan 13, 2021
ec41897
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 13, 2021
608d656
dev: run codeinsights-db (TimescaleDB)
Jan 15, 2021
0bf010e
docker-images: add codeinsights-db (re-tag TimescaleDB)
Jan 15, 2021
85d359e
add TODOs about places needing updates for timescaledb
Jan 16, 2021
728e4c0
running personal notes (to be moved to proper dev docs later)
Jan 16, 2021
3477821
add intial DB schema
Jan 16, 2021
1c695d4
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 16, 2021
111da5b
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 19, 2021
fc03157
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 19, 2021
7f68e88
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 19, 2021
66bf94b
rename histogram_events -> gauge_events; document table
Jan 19, 2021
3f745c5
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 19, 2021
25482c9
dev/drop-entire-local-database-and-redis.sh - make it work for codein…
Jan 19, 2021
1f755f9
README: update
Jan 19, 2021
314dcfd
DB schema take 2
Jan 19, 2021
7f2412b
note metadata filtering
Jan 19, 2021
8f73de7
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 19, 2021
e1528b6
scratch the surface of aggregation
Jan 19, 2021
a3e36bb
graphqlbackend: stub resolvers out
Jan 20, 2021
0dafd6e
initial GraphQL schema
Jan 20, 2021
8e1e370
enterprise: stub resolvers out
Jan 20, 2021
8db17be
graphql schema: fix typo
Jan 20, 2021
74c18f3
graphql schema: mark as experimental
Jan 21, 2021
e44cfdd
insights: add GraphQL backend scaffolding
Jan 21, 2021
3ecec6c
remove code added from merge conflict
Jan 21, 2021
1658590
Merge branch 'sg/codeinsights-graphql' into codeinsights
Jan 21, 2021
7423a50
insights: add GraphQL backend scaffolding
Jan 21, 2021
d11f384
Merge branch 'sg/codeinsights-graphql' into codeinsights
Jan 21, 2021
f807ff5
insights: add GraphQL backend scaffolding
Jan 21, 2021
0adfb1f
Merge branch 'sg/codeinsights-graphql' into codeinsights
Jan 21, 2021
ab84a46
graphql schema: fix merge conflict
Jan 21, 2021
0df0f1a
fix test
Jan 21, 2021
9a62c9d
update test NewSchema calls
Jan 21, 2021
6ebbcbb
Merge branch 'sg/codeinsights-graphql' into codeinsights
Jan 21, 2021
22f9804
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 21, 2021
f92b811
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 22, 2021
9391670
run database migrations
Jan 22, 2021
ecbe238
dev: correct codeinsights-db port; add connection info
Jan 22, 2021
1b89ccf
run database migrations
Jan 22, 2021
96e54f5
internal/db: do not run TimescaleDB migrations against singleton data…
Jan 22, 2021
7430a9d
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 22, 2021
ad13779
store WIP
Jan 25, 2021
0e0b816
WIP
Jan 25, 2021
600a856
internal/db: do not run TimescaleDB migrations against singleton data…
Jan 22, 2021
2263427
internal/database: make "single database" usage explicit
Jan 26, 2021
701851a
update MigrateDB callers (+ more type safety)
Jan 26, 2021
e44e1b2
fixup
Jan 26, 2021
54b2835
fix DB names (caught by test)
Jan 27, 2021
53e33ec
store WIP
Jan 27, 2021
6729785
Merge branch 'sg/insights-dbtesting' into codeinsights
Jan 27, 2021
2d12aa1
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 27, 2021
306c5d4
Merge remote-tracking branch 'origin/main' into sg/insights-dbtesting
Jan 27, 2021
609cbb8
Merge remote-tracking branch 'origin/main' into sg/insights-dbtesting
Jan 27, 2021
5fb7f14
fix typo
Jan 27, 2021
2272189
Merge branch 'sg/insights-dbtesting' into codeinsights
Jan 27, 2021
5e5aa5a
fixup store+resolvers
Jan 27, 2021
e9daa41
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 27, 2021
fe5f104
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 27, 2021
69e43b8
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 28, 2021
b3beb76
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 28, 2021
7cc9030
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 28, 2021
ef43acc
generate
Jan 28, 2021
b1583dd
settings schema
Jan 28, 2021
04117f2
graphql schema: add Connection
Jan 28, 2021
8aae68d
update GraphQL stubs to match new schema
Jan 29, 2021
dc06e67
store: fix test
Jan 29, 2021
b1026e4
improve testing experience
Jan 29, 2021
aa3be53
settings schema fixup
Jan 29, 2021
6ad3eb9
insights: resolvers: fetch insights from global user settings
Jan 29, 2021
1c53ca9
resolvers: implement TotalCount
Jan 29, 2021
36285a6
resolvers: pass through series schema/settings data
Jan 29, 2021
c5050c6
resolvers: keep track of which tests are needed
Jan 29, 2021
fdb48f5
rename gauge_events -> series_points
Jan 29, 2021
e5eb37d
store: fetch data points
Jan 29, 2021
f367fe9
DB schema: add series_id
Jan 29, 2021
1f5fec8
store: add series_id filtering
Jan 29, 2021
41dc093
migrations: generate
Jan 29, 2021
87e4934
README: inserting fake data; global settings
Jan 29, 2021
6615f35
insights: store/resolvers: implement series data point querying
Jan 29, 2021
e7ded9c
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 29, 2021
95719d2
Merge remote-tracking branch 'origin/main' into codeinsights
Jan 29, 2021
8db18bd
WIP repo-updater integration
Jan 30, 2021
145c688
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 1, 2021
4d14ef4
WIP
Feb 3, 2021
92b53ce
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 5, 2021
5d2ab43
insights: add GraphQL resolvers + extensive tests
Feb 8, 2021
501614a
gofmt
Feb 8, 2021
440dca5
Merge branch 'sg/insights-resolvers' into codeinsights
Feb 8, 2021
39bab99
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 8, 2021
e1819fb
insights: add initial DB schema
Feb 8, 2021
8313187
Merge branch 'sg/insights-schema' into codeinsights
Feb 8, 2021
7c0e04a
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 9, 2021
8866722
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 10, 2021
92ace47
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 10, 2021
5f76b74
fix merge conflicts
Feb 10, 2021
2459b5c
Merge branch 'sg/insights-worker' into codeinsights
Feb 10, 2021
274a5cb
background WIP
Feb 10, 2021
29e0c41
worker WIP
Feb 10, 2021
ae76295
background WIP
Feb 10, 2021
17d6e26
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 10, 2021
2e52637
Update README.codeinsights.md
Feb 12, 2021
e3a4e21
insights: store: query metadata & other minor improvements
Feb 12, 2021
139e77d
update resolver data
Feb 12, 2021
70c85f8
insights: store: add support for recording data points
Feb 12, 2021
07b2634
go generate ./enterprise/internal/insights/store/ (regenerate mocks)
Feb 13, 2021
c645642
go test -update
Feb 13, 2021
b1afbf9
insights: add new discovery package for locating insights
Feb 13, 2021
c0b23b3
insights: resolvers: use the new discovery package
Feb 13, 2021
bd4f455
gofmt
Feb 13, 2021
b1a847b
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 13, 2021
428a2e2
Merge branch 'sg/insights-store-inserts' into codeinsights
Feb 13, 2021
ee4d0a0
Merge branch 'sg/insights-discovery' into codeinsights
Feb 13, 2021
3deea82
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 16, 2021
066fbcf
fix merge conflicts
Feb 16, 2021
6fdcbdc
Merge remote-tracking branch 'origin/main' into codeinsights
Feb 17, 2021
b007308
Update README.codeinsights.md
Feb 18, 2021
218 changes: 218 additions & 0 deletions README.codeinsights.md
@@ -0,0 +1,218 @@
## codeinsights-db Docker Image

We republish the TimescaleDB (open source) Docker image under sourcegraph/codeinsights-db to ensure it uses our standard naming and versioning scheme. This is done in `docker-images/codeinsights-db/`.

## Getting a psql prompt (dev server)

```sh
docker exec -it codeinsights-db psql -U postgres
```

## Migrations

Since TimescaleDB is just Postgres (with an extension), we use the same SQL migration framework that we use for our other Postgres databases. `migrations/codeinsights` contains the migrations for the Code Insights database. They are executed when the frontend starts up (the same as, e.g., the codeintel DB migrations).

### Add a new migration

To add a new migration, use:

```
./dev/db/add_migration.sh codeinsights MIGRATION_NAME
```

See [migrations/README.md](migrations/README.md) for more information.

# Random stuff

## Upsert repo names

```
-- Insert the name if it is missing. RETURNING yields no row when the name
-- already exists, so the UNION falls back to selecting the existing ID.
WITH e AS (
    INSERT INTO repo_names(name)
    VALUES ('github.com/gorilla/mux-original')
    ON CONFLICT DO NOTHING
    RETURNING id
)
SELECT * FROM e
UNION
SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-original';

WITH e AS (
    INSERT INTO repo_names(name)
    VALUES ('github.com/gorilla/mux-renamed')
    ON CONFLICT DO NOTHING
    RETURNING id
)
SELECT * FROM e
UNION
SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-renamed';
```

## Upsert event metadata

Upsert metadata, getting back ID:

```
WITH e AS (
    INSERT INTO metadata(metadata)
    VALUES ('{"hello": "world", "languages": ["Go", "Python", "Java"]}')
    ON CONFLICT DO NOTHING
    RETURNING id
)
SELECT * FROM e
UNION
SELECT id FROM metadata WHERE metadata = '{"hello": "world", "languages": ["Go", "Python", "Java"]}';
```

## Inserting series points

```
INSERT INTO series_points(
    time,
    value,
    metadata_id,
    repo_id,
    repo_name_id,
    original_repo_name_id
) VALUES (
    now(),
    0.5,
    (SELECT id FROM metadata WHERE metadata = '{"hello": "world", "languages": ["Go", "Python", "Java"]}'),
    2,
    (SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-renamed'),
    (SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-original')
);
```

## Inserting fake data

```
INSERT INTO series_points(
    time,
    value,
    metadata_id,
    repo_id,
    repo_name_id,
    original_repo_name_id
)
SELECT time,
    random()*80 - 40,
    (SELECT id FROM metadata WHERE metadata = '{"hello": "world", "languages": ["Go", "Python", "Java"]}'),
    2,
    (SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-renamed'),
    (SELECT id FROM repo_names WHERE name = 'github.com/gorilla/mux-original')
FROM generate_series(TIMESTAMP '2020-01-01 00:00:00', TIMESTAMP '2020-06-01 00:00:00', INTERVAL '10 min') AS time;
```

## Querying all data

```
SELECT series_id,
    time,
    value,
    m.metadata,
    repo_id,
    repo_name.name,
    original_repo_name.name
FROM series_points p
INNER JOIN metadata m ON p.metadata_id = m.id
INNER JOIN repo_names repo_name ON p.repo_name_id = repo_name.id
INNER JOIN repo_names original_repo_name ON p.original_repo_name_id = original_repo_name.id
ORDER BY time DESC;
```

## Example Global Settings

```
"insights": [
  {
    "title": "fmt usage",
    "description": "fmt.Errorf/fmt.Printf usage",
    "series": [
      {
        "label": "fmt.Errorf",
        "search": "errorf",
      },
      {
        "label": "printf",
        "search": "fmt.Printf",
      }
    ]
  }
]
```
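As a rough illustration of the shape of these settings, the sketch below decodes them into Go structs. The type names (`Settings`, `Insight`, `Series`) and the `parseSettings` helper are hypothetical and not taken from the codebase. The example is rewritten as strict JSON (wrapped in an enclosing object, no trailing commas) because `encoding/json` rejects trailing commas.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Series is one search-backed line of an insight (hypothetical type).
type Series struct {
	Label  string `json:"label"`
	Search string `json:"search"`
}

// Insight groups one or more series under a title (hypothetical type).
type Insight struct {
	Title       string   `json:"title"`
	Description string   `json:"description"`
	Series      []Series `json:"series"`
}

// Settings holds only the "insights" key of the global settings.
type Settings struct {
	Insights []Insight `json:"insights"`
}

// exampleSettings is the README example expressed as strict JSON.
const exampleSettings = `{
  "insights": [
    {
      "title": "fmt usage",
      "description": "fmt.Errorf/fmt.Printf usage",
      "series": [
        {"label": "fmt.Errorf", "search": "errorf"},
        {"label": "printf", "search": "fmt.Printf"}
      ]
    }
  ]
}`

// parseSettings decodes a strict-JSON settings document.
func parseSettings(raw string) (Settings, error) {
	var s Settings
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	s, err := parseSettings(exampleSettings)
	if err != nil {
		panic(err)
	}
	for _, insight := range s.Insights {
		for _, series := range insight.Series {
			fmt.Printf("%s: %s -> %s\n", insight.Title, series.Label, series.Search)
		}
	}
}
```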

## Query data

### All data

```
SELECT * FROM series_points ORDER BY time DESC LIMIT 100;
```

### Filter by repo name, returning metadata (may be more optimally queried separately)

```
SELECT *
FROM series_points
JOIN metadata ON metadata.id = metadata_id
WHERE repo_name_id IN (
    SELECT id FROM repo_names WHERE name ~ '.*-renamed'
)
ORDER BY time DESC
LIMIT 100;
```

### Filter by metadata containing `{"hello": "world"}`

```
SELECT *
FROM series_points
JOIN metadata ON metadata.id = metadata_id
WHERE metadata @> '{"hello": "world"}'
ORDER BY time DESC
LIMIT 100;
```

### Filter by metadata containing Go languages

```
SELECT *
FROM series_points
JOIN metadata ON metadata.id = metadata_id
WHERE metadata @> '{"languages": ["Go"]}'
ORDER BY time DESC
LIMIT 100;
```

See https://www.postgresql.org/docs/9.6/functions-json.html for more operator possibilities. Only the `?`, `?&`, `?|`, and `@>` operators are indexed (GIN index).
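To make the `@>` containment semantics used in the filters above more concrete, here is a minimal Go sketch that approximates them on decoded JSON values. It is an illustration, not how Postgres implements the operator: `contains` and `containsJSON` are hypothetical helpers, and the special case where a top-level array contains a bare scalar is deliberately not modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// contains approximates jsonb containment (doc @> query):
//   - objects: every key in query must exist in doc with a contained value
//   - arrays:  every element of query must be contained in some element of doc
//   - scalars: the values must be equal
func contains(doc, query interface{}) bool {
	switch q := query.(type) {
	case map[string]interface{}:
		d, ok := doc.(map[string]interface{})
		if !ok {
			return false
		}
		for key, qv := range q {
			dv, present := d[key]
			if !present || !contains(dv, qv) {
				return false
			}
		}
		return true
	case []interface{}:
		d, ok := doc.([]interface{})
		if !ok {
			return false
		}
		for _, qe := range q {
			found := false
			for _, de := range d {
				if contains(de, qe) {
					found = true
					break
				}
			}
			if !found {
				return false
			}
		}
		return true
	default:
		return reflect.DeepEqual(doc, query)
	}
}

// containsJSON decodes both documents and applies contains.
func containsJSON(doc, query string) (bool, error) {
	var d, q interface{}
	if err := json.Unmarshal([]byte(doc), &d); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(query), &q); err != nil {
		return false, err
	}
	return contains(d, q), nil
}

func main() {
	doc := `{"hello": "world", "languages": ["Go", "Python", "Java"]}`
	for _, query := range []string{`{"hello": "world"}`, `{"languages": ["Go"]}`, `{"languages": ["Rust"]}`} {
		ok, err := containsJSON(doc, query)
		if err != nil {
			panic(err)
		}
		fmt.Println(query, ok)
	}
}
```

The point to take away is that containment matches on structure: the query `{"languages": ["Go"]}` matches any metadata whose `languages` array includes `"Go"`, regardless of the other elements or keys.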

### Get average/min/max value every 1h

```
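-- Group only by the time bucket. Also grouping by value would put every
-- distinct value in its own group, making AVG, MAX, and MIN all equal to it.
SELECT
    time_bucket(INTERVAL '1 hour', time) AS bucket,
    AVG(value),
    MAX(value),
    MIN(value)
FROM series_points
GROUP BY bucket;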
```

Note: This is not optimized. We could use materialized views to do continuous aggregation.

See https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates

## Why aren't insights being recorded?

Find insights background worker logs:

```
kubectl --namespace=prod logs repo-updater-76df6f4646-q92nx repo-updater | grep insights
```

## Get a psql prompt (Kubernetes)

```
kubectl -n prod exec -it codeinsights-db-5f5977f74d-8q9nl -- psql -U postgres
```
1 change: 1 addition & 0 deletions dev/db/squash_migrations.sh
@@ -67,6 +67,7 @@ DBNAME='squasher'
SERVER_VERSION=$(psql --version)

if [ "${SERVER_VERSION}" != 9.6 ]; then
# TODO: handling of timescaledb
echo "running PostgreSQL 9.6 in docker since local version is ${SERVER_VERSION}"
docker image inspect postgres:9.6 >/dev/null || docker pull postgres:9.6
docker rm --force "${DBNAME}" 2>/dev/null || true
1 change: 1 addition & 0 deletions dev/drop-entire-local-database-and-redis.sh
@@ -2,3 +2,4 @@

psql -c "drop schema public cascade; create schema public;"
redis-cli -c flushall
rm -rf $HOME/.sourcegraph-dev/data/codeinsights-db/
1 change: 1 addition & 0 deletions internal/database/schemadoc/main.go
@@ -71,6 +71,7 @@ func mainLocal() error {
}

func mainContainer() error {
// TODO: need handling for TimescaleDB here.
logger.Printf("Running PostgreSQL 9.6 in docker")

prefix, shutdown, err := startDocker()