-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segmentation fault when trying to connect RDS #40
Comments
Can you share \d my_table from psql? If you don't want to share pubiclally On Wednesday, May 11, 2016, Serdar Dogruyol notifications@github.com
|
@will emailed! |
Got it thanks.
and finally if there isn't anything obvious module PQ
class Connection
def read(frame_type)
size = read_i32
slice = read_bytes(size - 4)
frame = Frame.new(frame_type.not_nil!, slice).tap { |f| p f }
handle_async_frames(frame) ? read : frame
end
end
end that will monkey patch in printing every frame from the wire, which could help track where the failure is. That you might also want to put in a secret gist |
Hey there,
Even with monkey patching with above i can't see any stacktrace 😢 |
That makes me think it's dying in ssl negotiation. Try this? module PG
class Connection
private def negotiate_ssl
puts "in negotiate" # new
write_i32 8
write_i32 80877103
@soc.flush
puts "sent ssl request" # new
c = @soc.read_char
puts "response #{c.inspect}" # new
serv_ssl = case c
when 'S' then true
when 'N' then false
else
raise ConnectionError.new(
"Unexpected SSL response from server: #{c.inspect}")
end
pp serv_ssl #new
if serv_ssl
@soc = OpenSSL::SSL::Socket.new(@soc, sync_close: true)
end
if @conninfo.sslmode == :require && !@soc.is_a?(OpenSSL::SSL::Socket)
close
raise ConnectionError.new("sslmode=require and server did not establish SSL")
end
end
end
end |
Unfortunately even this gives no output, only segfault :/ |
You could get a backtrace for the segfault. At least on mac osx it's show in the Console app. I believe you can do the same in linux with gdb. |
@asterite how can i get that backtrace on OS X? |
Oh, actually... since we now have a sigfault handler that should appear when the program crashes, though I'm not sure it's as complete as the one Console used to show... |
@asterite Well i'm just getting this Is there any way to find the root? |
There's
|
I made an rds instance.
|
@will thanks a lot. Here's lldb stacktrace.
|
|
Oops. Seems we are missing a null check there. |
Also |
Nice catch 👍 |
I guess at that time we wanted to try to see if it works but didn't check for null... in any place 😊 |
I'll add the null checks and raise OpenSSL::SSL::Error. Sounds good? (or you can send a PR with this, I'm about to leave for now) |
@asterite that sounds good, I won't do a pr, but I don’t know why the |
@will Nope, it won't work. I don't know about openssl and the man doesn't seem to be very helpful: "The creation of a new SSL_CTX object failed. Check the error stack to find out the reason." |
Oh that reminds me of a terrible bug @uhoh-itsmaciek found with ruby and the ssl error stack. If you don't clear it after errors it just grows forever or something. |
http://www.educatedguesswork.org/2005/03/curse_you_opens.html
|
Yeah, I filed https://bugs.ruby-lang.org/issues/7215 with Ruby about this but I don't think it's gonna get fixed there. @petergeoghegan has a patch to Postgres with some workarounds for 9.6, and from what I can tell it is getting back-patched to older releases, too. Of course, this doesn't help with RDS if that is indeed the issue here. |
That fix will be in all the point releases coming out this week, FWIW.
|
@asterite meanwhile this issue doesn't exist in Crystal 0.15.0 |
@sdogruyol interesting. I'll rollback to Crystal 0.15.0 and see if it goes away for me. I am another datapoint for |
@datachomp @sdogruyol very interesting. I assume it's the same version of openssl that is being linked each time? |
@will i think so. How can we be sure? |
on os x |
Here's the working one with Crystal 0.15.0
The segfault one
|
Looks like the same libssl, which makes sense. Looking at the diff in crystal between 0.15 and 0.16, nothing is jumping out to me
diff --git a/src/openssl/bio.cr b/src/openssl/bio.cr
index eb83135..b6fcd8b 100644
--- a/src/openssl/bio.cr
+++ b/src/openssl/bio.cr
@@ -48,11 +48,9 @@ struct OpenSSL::BIO
crystal_bio
end
- @io : IO
@boxed_io : Void*
- @bio : LibCrypto::Bio*
- def initialize(@io)
+ def initialize(@io : IO)
@bio = LibCrypto.bio_new(pointerof(CRYSTAL_BIO))
# We need to store a reference to the box because it's
diff --git a/src/openssl/cipher.cr b/src/openssl/cipher.cr
index 94ea0ad..8f8ecf7 100644
--- a/src/openssl/cipher.cr
+++ b/src/openssl/cipher.cr
@@ -4,8 +4,6 @@ class OpenSSL::Cipher
class Error < OpenSSL::Error
end
- @ctx : Pointer(Void)?
-
def initialize(name)
cipher = LibCrypto.evp_get_cipherbyname name
raise ArgumentError.new "unsupported cipher algorithm #{name.inspect}" unless cipher
diff --git a/src/openssl/digest/digest.cr b/src/openssl/digest/digest.cr
index cf6d390..c627655 100644
--- a/src/openssl/digest/digest.cr
+++ b/src/openssl/digest/digest.cr
@@ -10,7 +10,6 @@ module OpenSSL
include DigestBase
getter name : String
- @ctx : LibCrypto::EVP_MD_CTX
def initialize(@name, @ctx : LibCrypto::EVP_MD_CTX)
raise Error.new("Invalid EVP_MD_CTX") unless @ctx
diff --git a/src/openssl/hmac.cr b/src/openssl/hmac.cr
index 4ab5472..b637b87 100644
--- a/src/openssl/hmac.cr
+++ b/src/openssl/hmac.cr
@@ -1,7 +1,7 @@
require "./lib_crypto"
class OpenSSL::HMAC
- def self.digest(algorithm : Symbol, key, data)
+ def self.digest(algorithm : Symbol, key, data) : Slice(UInt8)
evp = case algorithm
when :dss then LibCrypto.evp_dss
when :dss1 then LibCrypto.evp_dss1
@@ -23,7 +23,7 @@ class OpenSSL::HMAC
buffer[0, buffer_len.to_i]
end
- def self.hexdigest(algorithm : Symbol, key, data)
+ def self.hexdigest(algorithm : Symbol, key, data) : String
digest(algorithm, key, data).hexstring
end
end
diff --git a/src/openssl/md5.cr b/src/openssl/md5.cr
index 7941351..f218907 100644
--- a/src/openssl/md5.cr
+++ b/src/openssl/md5.cr
@@ -1,11 +1,11 @@
require "./lib_crypto"
class OpenSSL::MD5
- def self.hash(data : String)
+ def self.hash(data : String) : UInt8[16]
hash(data.to_unsafe, data.bytesize)
end
- def self.hash(data : UInt8*, bytesize : Int)
+ def self.hash(data : UInt8*, bytesize : Int) : UInt8[16]
buffer = uninitialized UInt8[16]
LibCrypto.md5(data, bytesize, buffer)
buffer
diff --git a/src/openssl/pkcs5.cr b/src/openssl/pkcs5.cr
index a25cc3e..06a8df5 100644
--- a/src/openssl/pkcs5.cr
+++ b/src/openssl/pkcs5.cr
@@ -1,7 +1,7 @@
require "./openssl"
module OpenSSL::PKCS5
- def self.pbkdf2_hmac_sha1(secret, salt, iterations = 2**16, key_size = 64)
+ def self.pbkdf2_hmac_sha1(secret, salt, iterations = 2**16, key_size = 64) : Slice(UInt8)
buffer = Slice(UInt8).new(key_size)
if LibCrypto.pkcs5_pbkdf2_hmac_sha1(secret, secret.bytesize, salt, salt.bytesize, iterations, key_size, buffer) != 1
raise OpenSSL::Error.new "pkcs5_pbkdf2_hmac"
diff --git a/src/openssl/ssl/context.cr b/src/openssl/ssl/context.cr
index 80d707b..481b069 100644
--- a/src/openssl/ssl/context.cr
+++ b/src/openssl/ssl/context.cr
@@ -1,10 +1,9 @@
class OpenSSL::SSL::Context
- @@default : OpenSSL::SSL::Context?
-
- def self.default
+ def self.default : self
@@default ||= new
end
+ # Do not remove this until version > 0.15.0, it's needed in 0.15.0
@handle : LibSSL::SSLContext
def initialize
diff --git a/src/openssl/ssl/socket.cr b/src/openssl/ssl/socket.cr
index 0da517f..9cca718 100644
--- a/src/openssl/ssl/socket.cr
+++ b/src/openssl/ssl/socket.cr
@@ -1,9 +1,6 @@
class OpenSSL::SSL::Socket
include IO
- @ssl : LibSSL::SSL
- @bio : OpenSSL::BIO
-
# If `sync_close` is true, closing this socket will
# close the underlying IO.
property? sync_close : Bool |
How about |
That's for gzip, I don’t think that'd cause anything seen here. |
Yeah. The diff is also looking fine. Weird |
@sdogruyol @will If you want you can email me any data so I can try to reproduce this and see what's wrong |
I'll send you creds to a db that reproduces this tomorrow On Monday, May 16, 2016, Ary Borenszweig notifications@github.com wrote:
|
So, it turned out that the segfault only happens if you create a connection in a constant. The problem is that these four lines are considered main code, and main code is executed before constants (and class vars) are initialized. I don't know what's the general solution to this is, but for now I pushed a workaround. I'll probably release 0.17.1 tomorrow with this fix. |
Just tried with 0.17.1 and it works beautifully. Thank you so much to everyone involved! |
0.17.1 is out and 0.17.3 will be out soon with another fix, so I'm going to close this now. Thanks for reporting! |
I'm trying to connect to my RDS instance with credentials like below.
It just segfaults without any stacktrace.
I'm clueless and need some help :/
The text was updated successfully, but these errors were encountered: