Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-profgen] Remove temporary perf script files #86668

Merged
merged 4 commits into from
Apr 11, 2024

Conversation

HaohaiWen
Copy link
Contributor

The temporary perf script files converted from perf data will occupy lots
of space for large project. This patch removes them when llvm-profgen
exits normally or receives signals.

The temporary perf script files converted from perf data will occupy lots
of space for large project. This patch removes them when llvm-profgen
exits normally or receives signals.
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Mar 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 26, 2024

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-pgo

Author: Haohai Wen (HaohaiWen)

Changes

The temporary perf script files converted from perf data will occupy lots
of space for large project. This patch removes them when llvm-profgen
exits normally or receives signals.


Full diff: https://github.com/llvm/llvm-project/pull/86668.diff

3 Files Affected:

  • (modified) llvm/tools/llvm-profgen/PerfReader.cpp (+19)
  • (modified) llvm/tools/llvm-profgen/PerfReader.h (+7)
  • (modified) llvm/tools/llvm-profgen/llvm-profgen.cpp (+2)
diff --git a/llvm/tools/llvm-profgen/PerfReader.cpp b/llvm/tools/llvm-profgen/PerfReader.cpp
index 878147642aa6e7..5ed0f1333cf2eb 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -11,6 +11,7 @@
 #include "llvm/DebugInfo/Symbolize/SymbolizableModule.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
 
 #define DEBUG_TYPE "perf-reader"
 
@@ -375,6 +376,9 @@ PerfScriptReader::convertPerfDataToTrace(ProfiledBinary *Binary,
                                           StringRef(ErrorFile)};    // Stderr
   sys::ExecuteAndWait(PerfPath, ScriptMMapArgs, std::nullopt, Redirects);
 
+  PerfScriptReader::markTempFile(PerfTraceFile);
+  PerfScriptReader::markTempFile(ErrorFile);
+
   // Collect the PIDs
   TraceStream TraceIt(PerfTraceFile);
   std::string PIDs;
@@ -992,6 +996,11 @@ bool PerfScriptReader::extractMMap2EventForBinary(ProfiledBinary *Binary,
   return Binary->getName() == BinaryName;
 }
 
+void PerfScriptReader::markTempFile(StringRef FileName) {
+  sys::RemoveFileOnSignal(FileName);
+  TempFiles.push_back(FileName.str());
+}
+
 void PerfScriptReader::parseMMap2Event(TraceStream &TraceIt) {
   MMapEvent MMap;
   if (extractMMap2EventForBinary(Binary, TraceIt.getCurrentLine(), MMap))
@@ -1081,6 +1090,14 @@ PerfContent PerfScriptReader::checkPerfScriptType(StringRef FileName) {
   return PerfContent::UnknownContent;
 }
 
+void PerfScriptReader::removeTempFiles() {
+  for (StringRef FileName : TempFiles) {
+    sys::fs::remove(FileName);
+    sys::DontRemoveFileOnSignal(FileName);
+  }
+  TempFiles.clear();
+}
+
 void HybridPerfReader::generateUnsymbolizedProfile() {
   ProfileIsCS = !IgnoreStackSamples;
   if (ProfileIsCS)
@@ -1220,5 +1237,7 @@ void PerfScriptReader::parsePerfTraces() {
     writeUnsymbolizedProfile(OutputFilename);
 }
 
+SmallVector<std::string, 2> PerfScriptReader::TempFiles;
+
 } // end namespace sampleprof
 } // end namespace llvm
diff --git a/llvm/tools/llvm-profgen/PerfReader.h b/llvm/tools/llvm-profgen/PerfReader.h
index e9f619350bf970..6401a4c51c8b10 100644
--- a/llvm/tools/llvm-profgen/PerfReader.h
+++ b/llvm/tools/llvm-profgen/PerfReader.h
@@ -603,6 +603,8 @@ class PerfScriptReader : public PerfReaderBase {
                          std::optional<uint32_t> PIDFilter);
   // Extract perf script type by peaking at the input
   static PerfContent checkPerfScriptType(StringRef FileName);
+  // Remove all temporary files.
+  static void removeTempFiles();
 
 protected:
   // The parsed MMap event
@@ -622,6 +624,8 @@ class PerfScriptReader : public PerfReaderBase {
   // mapping between the binary name and its memory layout.
   static bool extractMMap2EventForBinary(ProfiledBinary *Binary, StringRef Line,
                                          MMapEvent &MMap);
+  // Mark temporary file for future clean up.
+  static void markTempFile(StringRef FileName);
   // Update base address based on mmap events
   void updateBinaryAddress(const MMapEvent &Event);
   // Parse mmap event and update binary address
@@ -662,6 +666,9 @@ class PerfScriptReader : public PerfReaderBase {
   std::set<uint64_t> InvalidReturnAddresses;
   // PID for the process of interest
   std::optional<uint32_t> PIDFilter;
+
+  // Temporary files created by perf script command.
+  static SmallVector<std::string, 2> TempFiles;
 };
 
 /*
diff --git a/llvm/tools/llvm-profgen/llvm-profgen.cpp b/llvm/tools/llvm-profgen/llvm-profgen.cpp
index 3b974e25103ad4..d0ec463f717677 100644
--- a/llvm/tools/llvm-profgen/llvm-profgen.cpp
+++ b/llvm/tools/llvm-profgen/llvm-profgen.cpp
@@ -189,5 +189,7 @@ int main(int argc, const char *argv[]) {
     Generator->write();
   }
 
+  PerfScriptReader::removeTempFiles();
+
   return EXIT_SUCCESS;
 }

@HaohaiWen
Copy link
Contributor Author

Fix unresolved issue in #83489

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

Copy link
Member

@WenleiHe WenleiHe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit.

Comment on lines 379 to 380
PerfScriptReader::markTempFile(PerfTraceFile);
PerfScriptReader::markTempFile(ErrorFile);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would slightly prefer we do "createTempFile" instead of "createFile" + "MarkTemp". no strong opinion though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think sys::fs::createTemporaryFile has any of the cleanup logic from MarkTemp. There is the TempFile class but as far as I can tell that is meant towards the use case of getting a C++ file descriptor that you can write to but doesn't work to give you a filename + later cleanup; you only get a filename when you explicitely do not want the cleanup to happen.

So while I looked a bit when reviewing, I don't think there is much better than can be done here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to suggest combing "createUniquePath" and "markTempFile" into a single helper, so people won't create a temp file and forget to mark it as temp..

@MaskRay
Copy link
Member

MaskRay commented Mar 26, 2024

Thx. This mechanism is similar to "llvm/Support/ToolOutputFile.h"

@@ -189,5 +189,7 @@ int main(int argc, const char *argv[]) {
Generator->write();
}

PerfScriptReader::removeTempFiles();

return EXIT_SUCCESS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there are more "return EXIT_SUCCESS;" in other places, may be useful to also cover them(like use/call a wrapper function)? though other usages are not commonly used..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using CleanupInstaller can avoid this since cleanup will be auto run in destructor of static variable.

This can be used by others to automatically remove temp files.
@HaohaiWen
Copy link
Contributor Author

Thx. This mechanism is similar to "llvm/Support/ToolOutputFile.h"

Thanks for letting me know. This make the code much cleaner. I filed a patch to make CleanupInstaller public so that it can be used here. #86758

@HaohaiWen HaohaiWen merged commit 3f7f446 into llvm:main Apr 11, 2024
4 checks passed
@HaohaiWen HaohaiWen deleted the profgen branch April 11, 2024 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:support PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants