-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
res_ninit() requires cleanup with res_nclose() #12
Conversation
@Mergifyio rebase |
✅ Branch has been successfully rebased |
@lyokha : Could you suggest a reviewer please? (I do not have the necessary domain knowledge.) |
I would recommend @ivanmurashko. He is an excellent engineer in this area. |
@ivanmurashko, would you help out and review these changes? |
will do! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do with earlier and fewer placements of c_res_nclose
?
src/Network/DNS.hs
Outdated
fail "res_query(3) message size overflow" | ||
|
||
errno <- getErrno | ||
|
||
when (reslen < 0) $ do | ||
unless (errno == eOK) $ | ||
unless (errno == eOK) $ do | ||
c_res_nclose stptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't this call to c_res_nclose
be placed directly after getErrno
, rather than 3 times in each branch of the case distinction over reslen
and errno
?
If it does not affect getErrno
, it could even be placed directly after c_res_query
, could it?
(Same in the other occurrences below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are probably right, I'll test this.
@andreasabel, regarding your comment about excessive use of diff --git a/cbits/hs_resolv.h b/cbits/hs_resolv.h
index 37d486c..904b7d7 100644
--- a/cbits/hs_resolv.h
+++ b/cbits/hs_resolv.h
@@ -75,6 +75,14 @@ hs_res_query(struct __res_state *s, const char *dname, int class, int type, unsi
return res_nquery(s, dname, class, type, answer, anslen);
}
+inline static void
+hs_res_close(struct __res_state *s)
+{
+ assert(s);
+
+ res_nclose(s);
+}
+
#else
/* use non-reentrant API */
@@ -123,6 +131,11 @@ hs_res_query(void *s, const char *dname, int class, int type, unsigned char *ans
return res_query(dname, class, type, answer, anslen);
}
+inline static void
+hs_res_close(void *s)
+{
+}
+
#endif
#endif /* HS_RESOLV_H */
diff --git a/src/Network/DNS.hs b/src/Network/DNS.hs
index 5ae0828..1162155 100644
--- a/src/Network/DNS.hs
+++ b/src/Network/DNS.hs
@@ -152,6 +152,8 @@ queryRaw (Class cls) (Name name) qtype = withCResState $ \stptr -> do
resetErrno
reslen <- c_res_query stptr dn (fromIntegral cls) qtypeVal resptr max_msg_size
+ c_res_close stptr
+
unless (reslen <= max_msg_size) $
fail "res_query(3) message size overflow"
@@ -188,6 +190,8 @@ sendRaw req = withCResState $ \stptr -> do
resetErrno
reslen <- c_res_send stptr reqptr (fromIntegral reqlen) resptr max_msg_size
+ c_res_close stptr
+
unless (reslen <= max_msg_size) $
fail "res_send(3) message size overflow"
@@ -256,6 +260,8 @@ mkQueryRaw (Class cls) (Name name) qtype = withCResState $ \stptr -> do
resetErrno
reslen <- c_res_mkquery stptr dn (fromIntegral cls) qtypeVal resptr max_msg_size
+ c_res_close stptr
+
unless (reslen <= max_msg_size) $
fail "res_mkquery(3) message size overflow"
diff --git a/src/Network/DNS/FFI.hs b/src/Network/DNS/FFI.hs
index cb6953b..2ee5278 100644
--- a/src/Network/DNS/FFI.hs
+++ b/src/Network/DNS/FFI.hs
@@ -68,3 +68,6 @@ foreign import capi safe "hs_resolv.h res_opt_set_use_dnssec" c_res_opt_set_use_
-- int hs_res_mkquery(void *, const char *dname, int class, int type, unsigned char *req, int reqlen0);
foreign import capi safe "hs_resolv.h hs_res_mkquery" c_res_mkquery :: Ptr CResState -> CString -> CInt -> CInt -> Ptr CChar -> CInt -> IO CInt
+-- void hs_res_close(void *);
+foreign import capi safe "hs_resolv.h hs_res_close" c_res_close :: Ptr CResState -> IO ()
+ Note that I also renamed I also made a small test to demonstrate the memory leak. {-# LANGUAGE OverloadedStrings #-}
module Main where
import Network.DNS
import Control.Monad
import Control.Concurrent
main :: IO ()
main = foldM_ (const $ const $ do
v <- queryA (Name "www.google.com")
print v
threadDelay 1000
) () [1 .. 1000000] This is a small program to quickly run
I also managed to build the old interface ( If it looks good to you, I will add this improvements in a squashed commit. |
Yes, please go ahead. I was also wondering if the call to |
Yes, diff --git a/cbits/hs_resolv.h b/cbits/hs_resolv.h
index 37d486c..904b7d7 100644
--- a/cbits/hs_resolv.h
+++ b/cbits/hs_resolv.h
@@ -75,6 +75,14 @@ hs_res_query(struct __res_state *s, const char *dname, int class, int type, unsi
return res_nquery(s, dname, class, type, answer, anslen);
}
+inline static void
+hs_res_close(struct __res_state *s)
+{
+ assert(s);
+
+ res_nclose(s);
+}
+
#else
/* use non-reentrant API */
@@ -123,6 +131,11 @@ hs_res_query(void *s, const char *dname, int class, int type, unsigned char *ans
return res_query(dname, class, type, answer, anslen);
}
+inline static void
+hs_res_close(void *s)
+{
+}
+
#endif
#endif /* HS_RESOLV_H */
diff --git a/src/Network/DNS/FFI.hs b/src/Network/DNS/FFI.hs
index cb6953b..0490ac5 100644
--- a/src/Network/DNS/FFI.hs
+++ b/src/Network/DNS/FFI.hs
@@ -4,6 +4,7 @@
module Network.DNS.FFI where
import Control.Concurrent.MVar
+import Control.Exception (finally)
import Foreign.C
import Foreign.Marshal.Alloc
import Foreign.Ptr
@@ -49,7 +50,7 @@ withCResState :: (Ptr CResState -> IO a) -> IO a
withCResState act
| resIsReentrant = allocaBytes (fromIntegral sizeOfResState) $ \ptr -> do
_ <- c_memset ptr 0 sizeOfResState
- act ptr
+ act ptr `finally` c_res_close ptr
| otherwise = withMVar resolvLock $ \() -> act nullPtr
@@ -68,3 +69,6 @@ foreign import capi safe "hs_resolv.h res_opt_set_use_dnssec" c_res_opt_set_use_
-- int hs_res_mkquery(void *, const char *dname, int class, int type, unsigned char *req, int reqlen0);
foreign import capi safe "hs_resolv.h hs_res_mkquery" c_res_mkquery :: Ptr CResState -> CString -> CInt -> CInt -> Ptr CChar -> CInt -> IO CInt
+-- void hs_res_close(void *);
+foreign import capi safe "hs_resolv.h hs_res_close" c_res_close :: Ptr CResState -> IO ()
+ Note that I added |
There is a logical problem with |
Thanks for investigating. If that is the case, then it is better not to bundle it with |
@andreasabel, what if I introduce function withResInit :: Ptr CResState -> IO a -> IO a
withResInit stptr act = do
rc1 <- c_res_opt_set_use_dnssec stptr
unless (rc1 == 0) $
fail "res_init(3) failed"
resetErrno
act `finally` c_res_close stptr So that queryRaw :: Class -> Name -> Type -> IO BS.ByteString
queryRaw (Class cls) (Name name) qtype = withCResState $ \stptr -> do
allocaBytes max_msg_size $ \resptr -> do
_ <- c_memset resptr 0 max_msg_size
BS.useAsCString name $ \dn -> do
withResInit stptr $ do
reslen <- c_res_query stptr dn (fromIntegral cls) qtypeVal resptr max_msg_size
unless (reslen <= max_msg_size) $
fail "res_query(3) message size overflow"
errno <- getErrno
when (reslen < 0) $ do
unless (errno == eOK) $
throwErrno "res_query"
fail "res_query(3) failed"
BS.packCStringLen (resptr, fromIntegral reslen) Now finalization goes after |
Excellent structure! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great now!
Not yet! I just pushed another protection against asynchronous exceptions: see details in the commit. |
src/Network/DNS/FFI.hs
Outdated
fail "res_init(3) failed" | ||
resetErrno | ||
act `finally` c_res_close stptr | ||
withCResInit stptr act = flip finally (c_res_close stptr) $ do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused now. Shouldn't c_res_close
only be called after c_res_opt_set_use_dnssec
was successful?
I thought this was the reason not to integrate it directly into withCResState
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c_res_close
is actually safe to call inside withCResState
, but this is only lucky coincidence because we do
_ <- c_memset ptr 0 sizeOfResState
inside withCResState
after allocation of the state pointer. In this case (all fields, e.g. pointers, etc., are zeros), c_rec_close
is effectively no-op and thus safe. But logically, it must be called after succesful res_ninit()
.
By wrapping the whole withCResInit
in finally $ cres_close ptr
I use the same lucky assumption as could be used for withResCState
, but with withCResInit
this does not expand too far beyond the place of call to res_ninit()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... not exactly no-op, see glibc implementation here, but at least statp->nscount
will be 0 and no free
happen (inside function __res_iclose()
. Though I worry about possible call to __close_nocacnel_nostatus()
(because statp->_vsocks
is 0) and __resolv_conf_detach()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You really should structure this using bracket
and not finally
.
I think that what @andreasabel was asking.
If res_ninit
fails, one should assume there isn't anything to close/ cleanup. Docs don't say either way, especially they don't say that it's safe to call res_close
on pointer which is has failed to res_ninit
. One should not look into documentation to figure out whether it's safe or not, as there is no contract (= docs) it won't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... so probably it's safer to keep to the 2nd commit, though it makes possible to leak memory in environments with asynchronous exceptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In bracket
initialization is run with asynchronous exception masked. I don't understand how
bracket init c_res_close $ \sptr' -> act sptr' where
init = do
rc1 <- c_res_opt_set_use_dnssec stptr
unless (rc1 == 0) $ fail "res_init(3) failed"
resetErrno
return sptr
would leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that bracket won't leak, just tested with this:
withCResInit :: Ptr CResState -> IO a -> IO a
withCResInit stptr act = bracket
(do
rc1 <- c_res_opt_set_use_dnssec stptr
unless (rc1 == 0) $
fail "res_init(3) failed"
return ()
)
(const $ c_res_close stptr)
(const $ do
resetErrno
act
)
I'll commit this if you don't mind against style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ()
was excessive after unless
but I already committed
@lyokha: I reverted some of your cosmetic changes (redundant |
Ok, I think the PR is ready, the only little concern is that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, from my side, this PR is finished as well. I think the final result is very nice, giving more structure to the existing code.
@phadej , this is now over to you, unless you give me hackage rights, in which case I would release this. |
Ping @phadej : please add me as hackage maintainer. |
@andreasabel You could just start the second step in https://wiki.haskell.org/Taking_over_a_package procedure, a month is reasonable time in my mind. |
that were described and fixed in haskell-hvr/resolv#12
@Mergifyio rebase |
Linux man 3 resolver says: Every call to res_ninit() requires a corresponding call to res_nclose() to free memory allocated by res_ninit() and subsequent calls to res_nquery().
withCResInit which makes sure that allocated resources get correctly released by wrapping actions following a successful call to c_res_init in finally handler.
This protects against asynchronous exceptions. If an asynchronous exception happens somewhere inside of rc1 <- c_res_opt_set_use_dnssec stptr unless (rc1 == 0) $ fail "res_init(3) failed" resetErrno then it may leak memory after successful call to res_ninit() inside c_res_opt_set_use_dnssec. Note that it is safe to run c_res_close stptr if all fields of stptr are memzeroed (which is our case before the call to res_ninit()): this means that wrapping the entire body of withCResInit is safe even if an asynchronous exception occurs before the call to res_ninit().
✅ Branch has been successfully rebased |
Rebased onto master to test with GHC 9.6.1 (latest CI). PR should be SQUASHED. CI passes. @lyokha, are you happy with this PR in this form? Then we can merge it (by squashing). |
@andreasabel I think that it's ok. I will PR the configure warning fix after this. |
@andreasabel Could you do the merge? I am not familiar with Mergify at all :( |
Sure, did it. |
Alexey Radkov ***@***.***> writes:
Linux man 3 resolver says:
Every call to res_ninit() requires a corresponding call to
res_nclose() to free memory allocated by res_ninit() and subsequent
calls to res_nquery().
For the record, when I originally implemented `resolv` the "man-pages"
distribution
(c.f. https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/Archive/)
version 5.00 did not mention `res_nclose(3)` at all, so I simply
didn't know about the statement quoted above even though I diligently
consulted the man pages at the time.
As such, this knowledge gap was likely a wide-spread issue as Linux
distributions such as Debian suffered from this transitively
(c.f. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=909794) for a
long time, possibly resulting in a few memory leakage bugs out there in
older code written against the man-pages which described `res_ninit(3)`
before that critical relationship to `res_nclose(3)` was documented as
well.
|
Linux man 3 resolver says:
Every call to res_ninit() requires a corresponding call to
res_nclose() to free memory allocated by res_ninit() and subsequent
calls to res_nquery().
I use the library for periodic DNS polls from multiple async threads. Before the fix it was leaking.