Flush: fix error leak when flushing multiple messages in a lasting connection #239
Hi @stapelberg and @jronak, please note that this PR's description demonstrates an issue that is not caused or related to First of all check the output shown after running the repro code:
The error is not the same as the one seen in buffer overflow:
The error in this PR relates to the set or table not being present in nftables. I cannot tell from the description exactly how the table and set were created before the repro code was executed, but my current assumption is that creating either the table or the set failed. The repro code works for me if I properly create the table and set:
I assume that there is no bug in the library and that the code using it is simply not written correctly. I think it is best for @jronak to give more details about the issue he is experiencing.
Sorry for the late reply. This issue isn't related to the buffer overflow issue from @turekt. It is related to reusing a lasting connection after a failure on `flush`. The previous repro code wasn't clear enough, so I created something much simpler. Repro:

```go
package main

import (
	"fmt"

	"github.com/google/nftables"
)

func main() {
	// Open a lasting connection so the same netlink socket is reused across flushes.
	conn, err := nftables.New(nftables.AsLasting())
	if err != nil {
		panic(err)
	}
	defer conn.CloseLasting()

	fmt.Println("Flushing and updating a non-existent set which must fail")
	if err := flushAndUpdateSet(conn); err != nil {
		fmt.Printf("[expected] Failed to flushAndUpdateSet: %v\n", err)
	}

	fmt.Println("Create a table must pass")
	if err := createTable(conn); err != nil {
		fmt.Printf("[unexpected] Failed to createTable: %v\n", err)
	}
}

// createTable queues a single valid message and flushes it.
func createTable(conn *nftables.Conn) error {
	table := &nftables.Table{
		Family: nftables.TableFamilyIPv4,
		Name:   "test-table",
	}
	conn.AddTable(table)
	return conn.Flush()
}

// flushAndUpdateSet queues two messages against a set that does not exist,
// so both are expected to fail on flush.
func flushAndUpdateSet(conn *nftables.Conn) error {
	set := &nftables.Set{
		Name: "non-existent-set",
		Table: &nftables.Table{
			Name:   "non-existent-table",
			Family: nftables.TableFamilyIPv4,
		},
		KeyType: nftables.TypeInteger,
	}
	conn.FlushSet(set)
	if err := conn.SetAddElements(set, []nftables.SetElement{{Key: []byte{0x01, 0x00, 0x00, 0x00}}}); err != nil {
		return err
	}
	return conn.Flush()
}
```

Repro code does the following:
By running the repro code, you notice that both ops failed:
So the repro code reports an error for both (2) and (3). When you check the nft ruleset, (3) actually succeeded, but the repro code still returned an error. This happens because the errors from (2) were not fully drained from the connection.
When you flush multiple messages/ops on a connection and the flush fails to apply, the netlink connection returns an error per command. Since we return on the first error, the remaining errors stay buffered and leak into the result of the next flush. This pull request invokes `conn.Receive()` once per message sent to drain any buffered errors in the connection.
Hi @jronak, thanks for additional reproduction code. Now it all makes sense. Your repro basically accumulates two netlink messages in
Since both messages (2574 and 2572) are sent with one In a non-lasting connection this was not a problem since the second error for 2572 would drop from the buffer as soon as the netlink request messages are sent and connection closes, so your PR makes perfect sense. LGTM
Thanks everyone!
This issue is related to reusing a lasting connection after a failure on `flush`. You start a lasting connection, buffer 2+ messages that will fail, then flush them, which fails as expected. But when you reuse this connection to perform an operation that should pass, `flush` still returns an error, even though checking directly with `nft` shows that the operation succeeded. It turns out the errors from the previous flush were not fully drained and leak into the next operation. See the repro code to understand.