Hi, I wonder if you could please help? I'm encountering issues with this gem in a live project and I'm at a bit of a loss as to what is happening, so any help would be greatly appreciated!
Overview
The FTP integration is used by a number of old weather stations that connect via FTP over a mobile network and write 1 line of text every minute. I'm simply using an override for the write_file() function that looks at the filename and then passes the line on to be written to the database. This was working just fine for months until recently, when I started to see 'Error Connection reset by peer @ io_fillbuf - fd:{number}' errors. Once these start to appear in the logs, slowly all of the connections start to generate this error until the FTP server refuses to accept connections. The only fix I've found so far is to restart the application, after which it runs for an apparently random amount of time, from 7 hours to over a day, before this starts to happen again and requires another restart.
More weather stations were added but the total is only 28.
These stations use a mobile network and are kind of slow. Mostly the session takes < 1 second from connect to write and close, but this can be up to 9 seconds from what I've seen.
Apart from the weather stations connecting in, there's one connection from someone who connects every 5 minutes to read the files for their own purposes, which is some legacy task from a 3rd-party system.
There is a misconfigured weather station that tries to connect every minute with the wrong credentials.
There does not appear to be a consistent time frame between restart and the first instance of the error in the logs. Complete lockout (all connections refused) is almost daily though.
What are the settings?
It may be worth mentioning that this is running in a Docker environment with mapped ports.
The file system uses the default.
Initializers
require 'driver'
# Need to overwrite the write function so we can grab the data
Ftpd::DiskFileSystem.class_eval do
  def write_file(ftp_path, stream, mode)
    begin
      File.open(expand_ftp_path(ftp_path), mode) do |file|
        # Get the details from the path
        arr_acc_dh = file.path.gsub(FTP_ROOT, "").split("/")
        account_id = arr_acc_dh[1]
        datahog_id = arr_acc_dh[2]
        dh_importer = DataImporter.new()
        while line = stream.read
          # Ignore anything other than day data
          if ftp_path.include? "daydata.txt"
            dh_importer.import_data(line, account_id, datahog_id)
          elsif ftp_path.include? "rssidata.txt"
            dh_importer.import_ssri_data(line, account_id, datahog_id)
          end
          file.write line
        end
      end
    rescue Exception => e
      # Something bad happened
      puts "ERROR : #{DateTime.now} : #{Thread.current.object_id} : write_file: #{e.full_message} \n"
    end
  end
end
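For anyone trying to reproduce this, the filename dispatch in the override above can be exercised without an FTP session. This is only a sketch: `IMPORTER_METHODS` and `importer_method_for` are hypothetical names that are not in my project or in the gem, and the method symbols simply mirror the two `DataImporter` calls shown above.

```ruby
# Hypothetical helper (not part of ftpd or the original initializer).
# Maps a filename fragment to the DataImporter method used in write_file.
IMPORTER_METHODS = {
  "daydata.txt"  => :import_data,
  "rssidata.txt" => :import_ssri_data
}.freeze

# Returns the importer method name for an FTP path, or nil when the
# file is neither day data nor RSSI data (it is then only written to disk).
def importer_method_for(ftp_path)
  match = IMPORTER_METHODS.find { |fragment, _method| ftp_path.include?(fragment) }
  match && match.last
end
```

Inside the read loop this would be used roughly as `dh_importer.public_send(method_name, line, account_id, datahog_id) if method_name`, which keeps the same substring matching as the original `include?` checks.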
# Override to set host and port
Ftpd::Session.class_eval do
  def initialize_session
    @logged_in = false
    @data_type = 'A'
    @mode = 'S'
    @structure = 'F'
    @data_channel_protection_level = :clear
    @name_prefix = '/'
    # @data_hostname = nil
    # @data_port = nil
    @data_hostname = '{hostname}'
    @data_port = 2157
    @protection_buffer_size_set = false
    @epsv_all = false
    close_data_server_socket
    reset_failed_auths
  end
end
# Strange error on stat request on live server. Debug attempt
Ftpd::CmdStat.class_eval do
  def cmd_stat(argument)
    reply "220 sd/ftpd 2.1.0"
  end
end
Main execution block
# Set the log output and override to include thread info
skye_logger = Logger.new($stdout) # Logger.new('/var/log/ftpd.log')
skye_logger.formatter = proc do |severity, datetime, progname, msg|
  "#{severity} : #{datetime} : #{Thread.current.object_id} : #{msg}\n"
end
driver = Driver.new("/var/www/shared/ftp_storage", skye_logger) # Pass logger to driver
server = Ftpd::FtpServer.new(driver)
server.log = skye_logger
server.port = 2157
server.interface = '0.0.0.0'
server.nat_ip = '{External Server IP}'
server.passive_ports = Range.new(4021, 4060)
server.allow_low_data_ports = false
server.session_timeout = 45
server.failed_login_delay = 120
server.max_connections = 100
server.max_connections_per_ip = 1
server.tls = :off
server.start
puts "FTP Server listening on port #{server.bound_port}"
puts "FTP Server interface #{server.interface}"
begin
  gets
rescue Interrupt
  puts "Interrupt"
end
Driver Code
require 'ftpd'
require "tmpdir"
class Driver
  def initialize(ftp_root, logger)
    @logger = logger
    @ftp_root = ftp_root
    @full_datahog_data_path = ''
    @logger.debug "passed ftp_root #{ftp_root}"
  end
  def authenticate(user, password)
    begin
      # Expect the username to be consisting of 2 numbers account_id and hog_id.
      account_id = ''
      datahog_id = ''
      if user.split("_").count != 2
        raise "invalid login. Must consist of account_id and datahog_id e.g. '123_456'"
      end
      account_id = user.split("_")[0].to_i
      datahog_id = user.split("_")[1].to_i
      datahog = Datahog.includes(:account).references(:account).where(id: datahog_id).where({account: { id: account_id}}).where({datahogs: { ftp_password: password}}).where(account: {active: true})
      if datahog.blank?
        @logger.info "Invalid FTP credentials or inactive account. No matching datahog found for user: #{user} and password: #{password}"
        false
      else
        # Create the account directory if it doesn't exist already
        ftp_root_with_account = "#{@ftp_root}/#{account_id}"
        unless File.exist?(ftp_root_with_account)
          FileUtils.mkdir_p(ftp_root_with_account)
        end
        # Now the datahog directory
        self.instance_variable_set(:@full_datahog_path, "#{ftp_root_with_account}/#{datahog_id}")
        unless File.exist?(self.instance_variable_get(:@full_datahog_path))
          FileUtils.mkdir_p(self.instance_variable_get(:@full_datahog_path))
        end
        self.instance_variable_set(:@full_datahog_data_path, "#{self.instance_variable_get(:@full_datahog_path)}/data")
        unless File.exist?(self.instance_variable_get(:@full_datahog_data_path))
          FileUtils.mkdir_p(self.instance_variable_get(:@full_datahog_data_path))
        end
        @logger.debug "User '#{user}'. instance var full_datahog_data_path : '#{self.instance_variable_get(:@full_datahog_path)}'"
        true
      end
    rescue Exception => ex
      @logger.error "FTP driver - Authentication failure #{ex.full_message}"
      false
    end
  end
  def file_system(user)
    Ftpd::DiskFileSystem.new(self.instance_variable_get(:@full_datahog_path))
  end
end
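One thing I have not ruled out: the driver above stores the path in an instance variable during authenticate and reads it back in file_system. Assuming the single Driver instance passed to Ftpd::FtpServer.new is shared by every session, two stations logging in at nearly the same time could see each other's path. I have no evidence that this is related to the io_fillbuf error. A minimal sketch of deriving the path from the `user` argument instead is below; `datahog_path` is a hypothetical helper, not something in the gem or in my current code.

```ruby
# Hypothetical helper (not part of ftpd or the original driver).
# Builds the per-datahog directory from the FTP user name, which is
# expected to look like "123_456" (account_id, then datahog_id).
def datahog_path(ftp_root, user)
  parts = user.split("_")
  unless parts.size == 2
    raise ArgumentError, "expected 'account_id_datahog_id', got #{user.inspect}"
  end
  # to_i mirrors the conversion done in authenticate above
  account_id, datahog_id = parts.map(&:to_i)
  File.join(ftp_root, account_id.to_s, datahog_id.to_s)
end
```

With this, file_system(user) could return `Ftpd::DiskFileSystem.new(datahog_path(@ftp_root, user))` and hold no per-session state in the driver.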
Modification to get a full stack trace for all threads
CommandLoop modified with this function
def read_and_execute_commands
  catch :done do
    begin
      reply "220 #{server_name_and_version}"
      loop do
        begin
          s = get_command
          s = process_telnet_sequences(s)
          syntax_error unless s =~ /^(\w+)(?: (.*))?$/
          command, argument = $1.downcase, $2
          unless valid_command?(command)
            error "Syntax error, command unrecognized: #{s.chomp}", 500
          end
          command_sequence_checker.check command
          execute_command command, argument
        rescue FtpServerError => e
          reply e.message_with_code
        rescue => e
          # Get the full back trace
          puts "Error thread #{Thread.current.object_id}. Local error in processing. Error #{e.message}"
          thread_count = 0
          err_msg = ""
          Thread.list.each do |t|
            thread_count += 1
            err_msg += "--- thread #{thread_count} of total #{Thread.list.size}. Thread.id #{t.object_id} backtrace begin \n"
            # Lets see if we are able to pin down the culprit
            # by collecting backtrace for all existing threads:
            err_msg += t.backtrace.join("\n")
            err_msg += "\n---thread #{thread_count} of total #{Thread.list.size}. Thread.id #{t.object_id} backtrace end \n"
          end
          puts "Full thread & object backtrace : - \n #{err_msg}"
          reply "451 Requested action aborted. Local error in processing. Error #{e.message}"
          config.exception_handler.call(e) unless config.exception_handler.nil?
        end
      end
    rescue Errno::ECONNRESET, Errno::EPIPE
    end
  end
end
Working example weather station transfer
Note: Above is the most common scenario in that I don't always see the extra quit line...
...This is why I set the session timeout low in case connections aren't being closed.
Example of the odd one out
This connection is not made by the weather stations but by a 3rd party FTP request for their own purposes
Full stack trace on io_fillbuf error including all threads
Included log from the start of thread id 2753380 : -
20220922_full_stack_trace.txt