
segfault error in gemini inside the docker instance #282

Closed
temichus opened this issue Apr 14, 2023 · 2 comments · Fixed by #287
Comments

@temichus

Issue description

There is a segfault failure in Gemini inside the docker instance during the test gemini-1tb-10h.
At the same time, there are no errors or "signal SIGSEGV" messages in the loader's system.log during the Gemini run.

log:

Seed:			97
Maximum duration:	8h0m0s
Warmup duration:	2h0m0s
Concurrency:		50
Test cluster:		[10.12.10.242 10.12.11.102 10.12.10.28]
Oracle cluster:		[10.12.9.60]
Output file:		/gemini/gemini_result_d8452aee-0f69-4000-a871-dc7efe1f737c.log
Schema: {
    "keyspace": {
        "name": "ks1",
        "replication": {
            "class": "SimpleStrategy",
            "replication_factor": "3"
        },
        "oracle_replication": {
            "class": "SimpleStrategy",
            "replication_factor": "1"
        }
    },
    "tables": [
        {
            "name": "table1",
            "partition_keys": [
                {
                    "name": "pk0",
                    "type": "smallint"
                },
                {
                    "name": "pk1",
                    "type": "varint"
                }
            ],
            "clustering_keys": [
                {
                    "name": "ck0",
                    "type": "varchar"
                },
                {
                    "name": "ck1",
                    "type": "uuid"
                },
                {
                    "name": "ck2",
                    "type": "int"
                }
            ],
            "columns": [
                {
                    "name": "col0",
                    "type": "float"
                },
                {
                    "name": "col1",
                    "type": {
                        "kind": "set",
                        "type": "uuid",
                        "frozen": false
                    }
                },
                {
                    "name": "col2",
                    "type": "duration"
                },
                {
                    "name": "col3",
                    "type": "timeuuid"
                },
                {
                    "name": "col4",
                    "type": "double"
                },
                {
                    "name": "col5",
                    "type": {
                        "key_type": "date",
                        "value_type": "uuid",
                        "frozen": false
                    }
                },
                {
                    "name": "col6",
                    "type": "double"
                },
                {
                    "name": "col7",
                    "type": "varchar"
                },
                {
                    "name": "col8",
                    "type": {
                        "types": {
                            "udt_70180814_0": "ascii",
                            "udt_70180814_1": "timestamp",
                            "udt_70180814_2": "float",
                            "udt_70180814_3": "ascii",
                            "udt_70180814_4": "duration",
                            "udt_70180814_5": "int",
                            "udt_70180814_6": "float"
                        },
                        "type_name": "udt_70180814",
                        "frozen": true
                    }
                },
                {
                    "name": "col9",
                    "type": "duration"
                }
            ],
            "indexes": [
                {
                    "name": "table1_col0_idx",
                    "column": "col0",
                    "column_idx": 0
                },
                {
                    "name": "table1_col4_idx",
                    "column": "col4",
                    "column_idx": 4
                },
                {
                    "name": "table1_col6_idx",
                    "column": "col6",
                    "column_idx": 6
                }
            ],
            "materialized_views": [
                {
                    "name": "table1_mv_0",
                    "partition_keys": [
                        {
                            "name": "col3",
                            "type": "timeuuid"
                        },
                        {
                            "name": "pk0",
                            "type": "smallint"
                        },
                        {
                            "name": "pk1",
                            "type": "varint"
                        }
                    ],
                    "clustering_keys": [
                        {
                            "name": "ck0",
                            "type": "varchar"
                        },
                        {
                            "name": "ck1",
                            "type": "uuid"
                        },
                        {
                            "name": "ck2",
                            "type": "int"
                        }
                    ],
                    "NonPrimaryKey": {
                        "name": "col3",
                        "type": "timeuuid"
                    }
                }
            ],
            "known_issues": {
                "https://github.com/scylladb/scylla/issues/3708": true
            }
        }
    ]
}
{"L":"INFO","T":"2023-04-07T03:41:38.134Z","N":"generator","M":"starting partition key generation loop"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x81 pc=0x517ecf]

goroutine 456 [running]:
math/big.nat.setWord(...)
	/usr/local/go/src/math/big/nat.go:77
math/big.nat.setUint64({0x81?, 0x20?, 0xa90a80?}, 0xa90a01?)
	/usr/local/go/src/math/big/nat.go:84 +0x4f
math/big.(*Int).SetInt64(0xc1af030080, 0x45ca9a5b8d7ddb5b)
	/usr/local/go/src/math/big/int.go:55 +0x3f
math/big.NewInt(...)
	/usr/local/go/src/math/big/int.go:69
github.com/scylladb/gemini.SimpleType.GenValue({0xaf98b5, 0x6}, 0xc0000a80a0, {0x2710, 0x0, 0x3e8, 0x0, 0x0})
	/home/ls/repos/gemini_upstream/types.go:297 +0x377
github.com/scylladb/gemini.(*Generator).createPartitionKeyValues(0xc000214000, 0xc000608e50?)
	/home/ls/repos/gemini_upstream/generator.go:159 +0x12e
github.com/scylladb/gemini.(*Generator).start.func1()
	/home/ls/repos/gemini_upstream/generator.go:137 +0x135
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/home/ls/go/pkg/mod/golang.org/x/sync@v0.0.0-20200317015054-43a5402ce75a/errgroup/errgroup.go:57 +0x67
created by golang.org/x/sync/errgroup.(*Group).Go
	/home/ls/go/pkg/mod/golang.org/x/sync@v0.0.0-20200317015054-43a5402ce75a/errgroup/errgroup.go:54 +0x8d
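The panic originates deep inside math/big (`nat.setWord`), reached from `SimpleType.GenValue` via `big.NewInt` in a generator goroutine started through errgroup. A nil-pointer crash that far down in the standard library is usually a symptom of memory corruption in the caller, and a common cause in concurrent Go code is sharing a non-thread-safe value source (such as `*rand.Rand`) across goroutines without synchronization. The sketch below is an assumption for illustration, not the actual gemini code or the confirmed root cause: it shows the safe pattern of giving each worker goroutine its own random source when generating varint (`big.Int`) partition-key values concurrently.

```go
package main

import (
	"fmt"
	"math/big"
	"math/rand"
	"sync"
)

// genVarint builds a big.Int from the worker's own rand source.
// Hypothetical helper; stands in for a GenValue-style call.
func genVarint(r *rand.Rand) *big.Int {
	return big.NewInt(r.Int63())
}

func main() {
	const workers = 4
	const perWorker = 1000

	var wg sync.WaitGroup
	out := make([][]*big.Int, workers)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Per-goroutine source: *rand.Rand is not safe for
			// concurrent use, so each worker gets its own.
			r := rand.New(rand.NewSource(int64(w)))
			for i := 0; i < perWorker; i++ {
				out[w] = append(out[w], genVarint(r))
			}
		}(w)
	}
	wg.Wait()
	fmt.Println(len(out[0]), len(out[workers-1]))
}
```

Running the program under the race detector (`go run -race`) with a single shared `rand.Rand` instead would flag the data race; with per-goroutine sources it runs clean.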

Installation details

Test: gemini-1tb-10h
Test id: e5630ef8-ba07-4017-a52e-e1191e464dd2
Test name: scylla-master/gemini-/gemini-1tb-10h
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor e5630ef8-ba07-4017-a52e-e1191e464dd2
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs e5630ef8-ba07-4017-a52e-e1191e464dd2

Logs:

Jenkins job URL

@roydahan
Collaborator

@dkropachev any chance you can take a look at the crashes we have in Gemini?

@fruch
Collaborator

fruch commented Apr 23, 2023

Happened again this week; it looks like we are leaking memory:

[attached image]

Installation details

Kernel Version: 5.19.0-1022-aws
Scylla version (or git commit hash): 5.3.0~dev-20230415.1da02706ddb8 with build-id f7ac5cd90e63ace5065c583d6d1d9c381f39b5c2

Cluster size: 3 nodes (i3.4xlarge)

Scylla Nodes used in this run:

  • gemini-1tb-10h-master-oracle-db-node-08087b94-1 (107.21.57.78 | 10.12.10.56) (shards: 14)
  • gemini-1tb-10h-master-db-node-08087b94-5 (54.242.182.146 | 10.12.10.38) (shards: 14)
  • gemini-1tb-10h-master-db-node-08087b94-4 (54.91.135.97 | 10.12.8.140) (shards: 14)
  • gemini-1tb-10h-master-db-node-08087b94-3 (54.226.251.100 | 10.12.10.164) (shards: 14)
  • gemini-1tb-10h-master-db-node-08087b94-2 (54.91.159.223 | 10.12.11.63) (shards: 14)
  • gemini-1tb-10h-master-db-node-08087b94-1 (67.202.18.69 | 10.12.11.108) (shards: 14)

OS / Image: ami-0501eb17c8c79b6d2 (aws: us-east-1)

Test: gemini-1tb-10h
Test id: 08087b94-c377-4cea-8788-4cb54d2f5263
Test name: scylla-master/gemini-/gemini-1tb-10h
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 08087b94-c377-4cea-8788-4cb54d2f5263
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 08087b94-c377-4cea-8788-4cb54d2f5263

Logs:

Jenkins job URL
