Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File descriptors not being released when File IO tracing feature is enabled #375

Closed
ntoskrnl opened this issue Sep 14, 2022 · 8 comments
Closed
Assignees
Labels
bug Something isn't working Platform: Android

Comments

@ntoskrnl
Copy link

Gradle Version

7.3.3

AGP Version

7.1.0

Code Minifier/Optimizer

No response

Version

3.1.5

Sentry SDK Version

6.4.1

Steps to Reproduce

Apply Gradle plugin to project with default configuration.

In a loop, create a file, write to file, close file and delete it.
Run lsof command and check if file descriptors are released.

Expected Result

My expectation is that file descriptors should be released if the app doesn't use corresponding files and closes them correctly.

Tracing File IO feature causes file descriptors leak and eventually crashes the app. If this is expected behaviour, this should be documented.

Actual Result

File descriptors are still shown in the output of losof. If app creates files, after a while, it crashes with various errors, but the root cause is "To many open files".

@ntoskrnl ntoskrnl added bug Something isn't working Platform: Android labels Sep 14, 2022
@adinauer
Copy link
Member

Hello @ntoskrnl thanks for reporting. We'll take a look. I assume you have already disabled file instrumentation for now, correct?

@adinauer adinauer self-assigned this Sep 16, 2022
@adinauer
Copy link
Member

@ntoskrnl can you please confirm some assumptions:

  • You are using an emulator
  • You then run lsof on your host OS like lsof -p 6803 to get only results for the emulator process

I am able to reproduce the crash but I can't see the problem using lsof.
Could you please share the output of your lsof and the exact command you run?

@adinauer
Copy link
Member

Oh and can you please also share the code you're using to reproduce just so we can verify it also fixes your problem once we come up with a solution.

@ntoskrnl
Copy link
Author

ntoskrnl commented Sep 17, 2022

To test it on real device with ADB, it should be rooted. lsof -p will only work with root permissions.

I am testing on both emulator and phone, and run lsof (without arguments) from within the app (using Runtime.exec()). Or on rooted device from root adb shell (with -p argument).

Here are some helper classes to reproduce the issue:

  1. Helper to run shell command from within the app.
object Shell {
    private const val TAG = "Shell"

    fun shellOutput(command: String?): String {
        var process: Process? = null
        var bufferedReader: BufferedReader? = null
        return try {
            process = Runtime.getRuntime().exec(command)
            bufferedReader = BufferedReader(
                InputStreamReader(process.inputStream)
            )
            val log = StringBuilder()
            var line: String?
            log.append(String.format("Executing a command '%s'", command)).append("\n")
            while (bufferedReader.readLine().also { line = it } != null) {
                log.append(line).append("\n")
            }
            log.toString()
        } catch (e: IOException) {
            Log.e(TAG, "logcat extraction error", e)
            ""
        } finally {
            if (bufferedReader != null) {
                try {
                    bufferedReader.close()
                } catch (e: IOException) {
                    Log.e(TAG, "logcat reading error", e)
                }
            }
            process?.destroy()
        }
    }
}
  1. MainActivity that runs the test
class MainActivity : AppCompatActivity() {

    private var job: Job? = null

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
    }
    
    override fun onStart() {
        super.onStart()

        job = GlobalScope.launch(Dispatchers.IO) {
            launch {
                fileWriterTest()
            }

            launch {
                fdMonitor()
            }
        }
    }

    override fun onStop() {
        job?.cancel()
        job = null
        super.onStop()
    }

    private suspend fun fileWriterTest() {
        val cacheDir = cacheDir
        while (true) {
            runInterruptible {
                val file = File(cacheDir, "${FILE_NAME_PREFIX}${System.currentTimeMillis()}.bin")
                FileOutputStream(file, false).use { outputStream ->
                    outputStream.write("Hello world".toByteArray())
                    outputStream.flush()
                }
                Thread.sleep(499)
                file.delete()
            }
            delay(1)
        }
    }

    private suspend fun fdMonitor() {
        while (true) {
            val allLines = runInterruptible { Shell.shellOutput("lsof").lines() }
            val fdCount = allLines.size
            val deletedCount = allLines.count { "(deleted)" in it }
            val testFileCount = allLines.count { FILE_NAME_PREFIX in it }
            val testDeletedCount = allLines.count { FILE_NAME_PREFIX in it && "(deleted)" in it }
            Timber.tag("FdMonitor").v("%d total files (deleted %d), %d test files (deleted %d)", fdCount, deletedCount, testFileCount, testDeletedCount)
            delay(5000)
        }
    }

    companion object {
        private const val FILE_NAME_PREFIX = "test-file-"
    }
}

Here is the output that I get (filter by "FdMonitor"):

2022-09-17 14:18:56.327 26095-26139/com.example.test V/FdMonitor: 895 total files (deleted 342), 1 test files (deleted 0)
2022-09-17 14:19:01.833 26095-26140/com.example.test V/FdMonitor: 930 total files (deleted 362), 13 test files (deleted 12)
2022-09-17 14:19:07.330 26095-26145/com.example.test V/FdMonitor: 940 total files (deleted 372), 24 test files (deleted 23)
2022-09-17 14:19:12.781 26095-26140/com.example.test V/FdMonitor: 950 total files (deleted 383), 34 test files (deleted 34)
2022-09-17 14:19:18.888 26095-26145/com.example.test V/FdMonitor: 966 total files (deleted 392), 40 test files (deleted 38)
2022-09-17 14:19:24.341 26095-26143/com.example.test V/FdMonitor: 976 total files (deleted 404), 51 test files (deleted 49)
2022-09-17 14:19:29.649 26095-26143/com.example.test V/FdMonitor: 981 total files (deleted 409), 61 test files (deleted 59)
2022-09-17 14:19:35.127 26095-26145/com.example.test V/FdMonitor: 1000 total files (deleted 428), 72 test files (deleted 70)
2022-09-17 14:19:40.601 26095-26145/com.example.test V/FdMonitor: 1007 total files (deleted 435), 82 test files (deleted 80)
2022-09-17 14:19:46.062 26095-26145/com.example.test V/FdMonitor: 1017 total files (deleted 445), 93 test files (deleted 91)
2022-09-17 14:19:51.534 26095-26145/com.example.test V/FdMonitor: 1028 total files (deleted 456), 104 test files (deleted 102)
2022-09-17 14:19:56.999 26095-26145/com.example.test V/FdMonitor: 1038 total files (deleted 466), 114 test files (deleted 112)
2022-09-17 14:20:02.485 26095-26145/com.example.test V/FdMonitor: 1049 total files (deleted 477), 125 test files (deleted 123)
2022-09-17 14:20:07.951 26095-26145/com.example.test V/FdMonitor: 1059 total files (deleted 487), 136 test files (deleted 134)
2022-09-17 14:20:13.430 26095-26145/com.example.test V/FdMonitor: 1071 total files (deleted 499), 146 test files (deleted 144)
2022-09-17 14:20:18.924 26095-26145/com.example.test V/FdMonitor: 1082 total files (deleted 510), 157 test files (deleted 155)
2022-09-17 14:20:24.417 26095-26145/com.example.test V/FdMonitor: 1092 total files (deleted 520), 168 test files (deleted 166)
2022-09-17 14:20:29.927 26095-26145/com.example.test V/FdMonitor: 1097 total files (deleted 525), 179 test files (deleted 177)

To reproduce crash faster, I also reduce the max open file limit via native call to setrlimit.

@ntoskrnl
Copy link
Author

It's actually not issue of Gradle plugin (the plugin and byte-code transformation works). But the problem with SentryFileOutputStream (sentry-java).

@ntoskrnl
Copy link
Author

ntoskrnl commented Sep 17, 2022

The problem is simple: when SentryFileOutputStream it opens 2 file descriptors instead of 1.

When we call constuctor:

public SentryFileOutputStream(final @Nullable File file, final boolean append)
    throws FileNotFoundException {
  this(init(file, append, null, HubAdapter.getInstance()));
}

the init method will create FileOutputStreamInitData with a new FileInputStream:

delegate = new FileOutputStream(file);

This is file descriptor 1.

Then we call private constructor

private SentryFileOutputStream(final @NotNull FileOutputStreamInitData data)
    throws FileNotFoundException {
  super(data.file, data.append);
  spanManager = new FileIOSpanManager(data.span, data.file, data.isSendDefaultPii);
  delegate = data.delegate;
}

which calls super(data.file, data.append) - it will open another new file descriptor.

Thus, there will be 2 FDs: One in the delegate and another inside SentryFileOutputStream itself.

Closing SentryFileOutputStream.close() will close the delegate's descriptor, but will because there is not call to super.close(), it won't close SentryFileOutputStream's descriptor.

Maybe here in the private constructor it should call super(delegate.fd) to prevent opening a new descriptor?

@adinauer
Copy link
Member

Thanks @ntoskrnl for the PR and the infos here, will do some testing on your changes today then hopefully merge and release soon. I've come up with a different approach which I wanted to improve before releasing but I like yours better, so going with that instead.

@adinauer
Copy link
Member

Version 6.4.2 of the Java SDK and 3.1.6 of the Sentry Android Gradle Plugin have been released containing a fix for this bug. Thanks again @ntoskrnl for the help 🙏 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Platform: Android
Projects
Archived in project
Development

No branches or pull requests

2 participants