
Integrate Line Profiler in Codeflash CF-470 #35

Merged · 53 commits · Apr 2, 2025
0e75144
todo
aseembits93 Mar 5, 2025
a9f6196
boilerplate ready, need to fill in the gaps
aseembits93 Mar 6, 2025
9ebbd61
wip
aseembits93 Mar 6, 2025
371ae4c
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 10, 2025
f493836
avoid instrument_codeflash_capture
aseembits93 Mar 11, 2025
0220ee6
working mvp, need to parse though
aseembits93 Mar 11, 2025
339362f
undo coverage util modification
aseembits93 Mar 11, 2025
2f5baa9
cleaning up
aseembits93 Mar 11, 2025
de43f6b
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 11, 2025
d171ba7
adding some runconfigs
aseembits93 Mar 12, 2025
8646196
line profiler results are saved in a temp file, need to pass to ai se…
aseembits93 Mar 12, 2025
51b8e27
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 12, 2025
4247780
working demo of new opt candidates with lineprof info
aseembits93 Mar 15, 2025
52715ba
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 15, 2025
ca0540e
removing env files
aseembits93 Mar 16, 2025
0a78e7e
concurrent execution of new optim candidates, readonly context testing
aseembits93 Mar 17, 2025
93623cb
still debugging readonly context code
aseembits93 Mar 17, 2025
102de11
works, need to follow type hints
aseembits93 Mar 19, 2025
e292831
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 19, 2025
040a747
exception handling when lprof fails
aseembits93 Mar 19, 2025
7a413f6
feedback, cleanup
aseembits93 Mar 21, 2025
127f466
todo, improve testing
aseembits93 Mar 21, 2025
e224234
cleaning
aseembits93 Mar 21, 2025
3b9bee0
redo newline
aseembits93 Mar 21, 2025
e6a9066
better test, improve testing
aseembits93 Mar 22, 2025
c805624
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 24, 2025
0b77d2a
wip concurrent optimization loop
aseembits93 Mar 24, 2025
81a6b78
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Mar 25, 2025
89e72b7
refactored to make a new category for line profiler tests
aseembits93 Mar 26, 2025
c00e324
works for any level of nested function
aseembits93 Mar 26, 2025
7ffe25f
works, optimization list length will be incorrectly displayed as we a…
aseembits93 Mar 29, 2025
d24d24c
fix for indentation in line profiler output
aseembits93 Mar 29, 2025
82f4138
cleaning up
aseembits93 Mar 29, 2025
29f4c1a
wrote some tests for instrumentation+testrun+parsing, todo, write som…
aseembits93 Mar 29, 2025
862eae8
Merge branch 'main' into line-profiler
aseembits93 Mar 31, 2025
ee90a08
putting file restore in the finally block
aseembits93 Mar 31, 2025
e17106f
putting file restore in the finally block and also handling failures …
aseembits93 Mar 31, 2025
b3e38f2
quick fix for merge with main
aseembits93 Mar 31, 2025
4af8442
minor fixes
aseembits93 Mar 31, 2025
d8246fc
pathlib for r/w
aseembits93 Mar 31, 2025
97c3cac
set up empty line profile results if there's an error
aseembits93 Mar 31, 2025
6418e7d
better exception handling for line profiler
aseembits93 Mar 31, 2025
275b88b
better type hinting
aseembits93 Mar 31, 2025
4b20465
add line_profiler as a requirement in pyproject toml
aseembits93 Mar 31, 2025
00abefb
more tests
aseembits93 Apr 1, 2025
0ab9dbf
import order issue
aseembits93 Apr 1, 2025
7e764ec
Merge branch 'main' into line-profiler
misrasaurabh1 Apr 1, 2025
a628dae
formatting changes
aseembits93 Apr 1, 2025
2abbfba
moving line profiler instrument tests in a separate file
aseembits93 Apr 1, 2025
b8614e0
tests galore
aseembits93 Apr 1, 2025
ab06721
Merge remote-tracking branch 'origin/main' into line-profiler
aseembits93 Apr 1, 2025
b269f87
nested classes handled now
aseembits93 Apr 1, 2025
3cab8ea
change max workers to one
aseembits93 Apr 1, 2025
6 changes: 6 additions & 0 deletions code_to_optimize/bubble_sort_classmethod.py
@@ -0,0 +1,6 @@
from code_to_optimize.bubble_sort_in_class import BubbleSortClass


def sort_classmethod(x):
y = BubbleSortClass()
return y.sorter(x)
6 changes: 6 additions & 0 deletions code_to_optimize/bubble_sort_nested_classmethod.py
@@ -0,0 +1,6 @@
from code_to_optimize.bubble_sort_in_nested_class import WrapperClass


def sort_classmethod(x):
y = WrapperClass.BubbleSortClass()
return y.sorter(x)
70 changes: 70 additions & 0 deletions codeflash/api/aiservice.py
@@ -135,6 +135,76 @@ def optimize_python_code(
console.rule()
return []

def optimize_python_code_line_profiler(
self,
source_code: str,
dependency_code: str,
trace_id: str,
line_profiler_results: str,
num_candidates: int = 10,
experiment_metadata: ExperimentMetadata | None = None,
) -> list[OptimizedCandidate]:
"""Optimize the given python code for performance by making a request to the Django endpoint.

Parameters
----------
- source_code (str): The python code to optimize.
- dependency_code (str): The dependency code used as read-only context for the optimization.
- trace_id (str): Trace id of the optimization run.
- line_profiler_results (str): Line profiler output for the function being optimized.
- num_candidates (int): Number of optimization variants to generate. Default is 10.
- experiment_metadata (ExperimentMetadata | None): Any available experiment metadata for this optimization.

Returns
-------
- list[OptimizedCandidate]: A list of optimization candidates.

"""
payload = {
"source_code": source_code,
"dependency_code": dependency_code,
"num_variants": num_candidates,
"line_profiler_results": line_profiler_results,
"trace_id": trace_id,
"python_version": platform.python_version(),
"experiment_metadata": experiment_metadata,
"codeflash_version": codeflash_version,
}

logger.info("Generating optimized candidates…")
console.rule()
if line_profiler_results == "":
logger.info("No line profiler results were provided; skipping optimization.")
console.rule()
return []
try:
response = self.make_ai_service_request("/optimize-line-profiler", payload=payload, timeout=600)
except requests.exceptions.RequestException as e:
logger.exception(f"Error generating optimized candidates: {e}")
ph("cli-optimize-error-caught", {"error": str(e)})
return []

if response.status_code == 200:
optimizations_json = response.json()["optimizations"]
logger.info(f"Generated {len(optimizations_json)} candidates.")
console.rule()
return [
OptimizedCandidate(
source_code=opt["source_code"],
explanation=opt["explanation"],
optimization_id=opt["optimization_id"],
)
for opt in optimizations_json
]
try:
error = response.json()["error"]
except Exception:
error = response.text
logger.error(f"Error generating optimized candidates: {response.status_code} - {error}")
ph("cli-optimize-error-response", {"response_status_code": response.status_code, "error": error})
console.rule()
return []


def log_results(
self,
function_trace_id: str,
223 changes: 223 additions & 0 deletions codeflash/code_utils/line_profile_utils.py
@@ -0,0 +1,223 @@
"""Adapted from line_profiler (https://github.com/pyutils/line_profiler) written by Enthought, Inc. (BSD License)"""
from collections import defaultdict
from pathlib import Path
from typing import Union

import isort
import libcst as cst

from codeflash.code_utils.code_utils import get_run_tmp_file


class LineProfilerDecoratorAdder(cst.CSTTransformer):
"""Transformer that adds a decorator to a function with a specific qualified name."""

# TODO: nested functions are not supported yet, so targets can only live inside classes; match on the structure instead of qualified names
def __init__(self, qualified_name: str, decorator_name: str):
"""Initialize the transformer.

Args:
qualified_name: The fully qualified name of the function to add the decorator to (e.g., "MyClass.nested_func.target_func").
decorator_name: The name of the decorator to add.

"""
super().__init__()
self.qualified_name_parts = qualified_name.split(".")
self.decorator_name = decorator_name

# Track our current context path, only add when we encounter a class
self.context_stack = []

def visit_ClassDef(self, node: cst.ClassDef) -> None:
# Track when we enter a class
self.context_stack.append(node.name.value)

def leave_ClassDef(self, original_node: cst.ClassDef, updated_node: cst.ClassDef) -> cst.ClassDef:
# Pop the context when we leave a class
self.context_stack.pop()
return updated_node

def visit_FunctionDef(self, node: cst.FunctionDef) -> None:
# Track when we enter a function
self.context_stack.append(node.name.value)

def leave_FunctionDef(self, original_node: cst.FunctionDef, updated_node: cst.FunctionDef) -> cst.FunctionDef:
function_name = original_node.name.value

# Check if the current context path matches our target qualified name
if self.context_stack == self.qualified_name_parts:
# Check if the decorator is already present
has_decorator = any(
self._is_target_decorator(decorator.decorator)
for decorator in original_node.decorators
)

# Only add the decorator if it's not already there
if not has_decorator:
new_decorator = cst.Decorator(
decorator=cst.Name(value=self.decorator_name)
)

# Add our new decorator to the existing decorators
updated_decorators = [new_decorator] + list(updated_node.decorators)
updated_node = updated_node.with_changes(
decorators=tuple(updated_decorators)
)

# Pop the context when we leave a function
self.context_stack.pop()
return updated_node

def _is_target_decorator(self, decorator_node: Union[cst.Name, cst.Attribute, cst.Call]) -> bool:
"""Check if a decorator matches our target decorator name."""
if isinstance(decorator_node, cst.Name):
return decorator_node.value == self.decorator_name
if isinstance(decorator_node, cst.Call) and isinstance(decorator_node.func, cst.Name):
return decorator_node.func.value == self.decorator_name
return False
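The transformer above decorates a function only when its full class-nesting path matches the target qualified name. A minimal stdlib `ast` sketch of the same context-stack matching idea (the helper name `find_by_qualified_name` is hypothetical, for illustration only):

```python
import ast


def find_by_qualified_name(source: str, qualified_name: str) -> bool:
    """Return True if a function with the given dotted path exists,
    mirroring the context-stack matching used by the transformer."""
    target = qualified_name.split(".")
    found = False

    def walk(node, stack):
        nonlocal found
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                path = stack + [child.name]
                # only an exact path match on a function counts, like context_stack == qualified_name_parts
                if path == target and isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    found = True
                walk(child, path)
            else:
                walk(child, stack)

    walk(ast.parse(source), [])
    return found


src = "class Outer:\n    class Inner:\n        def sorter(self): pass\n"
print(find_by_qualified_name(src, "Outer.Inner.sorter"))  # True
print(find_by_qualified_name(src, "Inner.sorter"))        # False: partial paths do not match
```

This is why the nested-class case (`WrapperClass.BubbleSortClass.sorter`) works: the match is against the whole nesting path, not just the function name.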

class ProfileEnableTransformer(cst.CSTTransformer):
def __init__(self, filename):
# Flag to track if we found the import statement
self.found_import = False
# Track indentation of the import statement
self.import_indentation = None
self.filename = filename

def leave_ImportFrom(self, original_node: cst.ImportFrom, updated_node: cst.ImportFrom) -> cst.ImportFrom:
# Check if this is the line profiler import statement
if (isinstance(original_node.module, cst.Name) and
original_node.module.value == "line_profiler" and
any(name.name.value == "profile" and
(not name.asname or name.asname.name.value == "codeflash_line_profile")
for name in original_node.names)):

self.found_import = True
# Get the indentation from the original node
if hasattr(original_node, "leading_lines"):
leading_whitespace = original_node.leading_lines[-1].whitespace if original_node.leading_lines else ""
self.import_indentation = leading_whitespace

return updated_node

def leave_Module(self, original_node: cst.Module, updated_node: cst.Module) -> cst.Module:
if not self.found_import:
return updated_node

# Create a list of statements from the original module
new_body = list(updated_node.body)

# Find the index of the import statement
import_index = None
for i, stmt in enumerate(new_body):
if isinstance(stmt, cst.SimpleStatementLine):
for small_stmt in stmt.body:
if isinstance(small_stmt, cst.ImportFrom):
if (isinstance(small_stmt.module, cst.Name) and
small_stmt.module.value == "line_profiler" and
any(name.name.value == "profile" and
(not name.asname or name.asname.name.value == "codeflash_line_profile")
for name in small_stmt.names)):
import_index = i
break
if import_index is not None:
break

if import_index is not None:
# Create the new enable statement to insert after the import
enable_statement = cst.parse_statement(
f"codeflash_line_profile.enable(output_prefix='{self.filename}')"
)

# Insert the new statement after the import statement
new_body.insert(import_index + 1, enable_statement)

# Create a new module with the updated body
return updated_node.with_changes(body=new_body)

def add_decorator_to_qualified_function(module, qualified_name, decorator_name):
"""Add a decorator to a function with the exact qualified name in the source code.

Args:
module: The libcst Module to transform.
qualified_name: The fully qualified name of the function to add the decorator to (e.g., "MyClass.nested_func.target_func").
decorator_name: The name of the decorator to add.

Returns:
The modified libcst Module.

"""
# Apply our transformer to the already-parsed module
transformer = LineProfilerDecoratorAdder(qualified_name, decorator_name)
modified_module = module.visit(transformer)

# Return the modified CST module
return modified_module

def add_profile_enable(original_code: str, line_profile_output_file: str) -> str:
# TODO modify by using a libcst transformer
module = cst.parse_module(original_code)
transformer = ProfileEnableTransformer(line_profile_output_file)
modified_module = module.visit(transformer)
return modified_module.code
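`add_profile_enable` injects an `enable()` call immediately after the profiler import so the profile data is written to the chosen output path. A simplified, text-based sketch of that transformation (the real implementation walks the CST; `add_enable_after_import` is a hypothetical stand-in):

```python
def add_enable_after_import(code: str, output_prefix: str) -> str:
    """Insert an enable() call on the line right after the
    codeflash_line_profile import (simplified text version)."""
    out = []
    for line in code.splitlines():
        out.append(line)
        if "import profile as codeflash_line_profile" in line:
            out.append(f"codeflash_line_profile.enable(output_prefix='{output_prefix}')")
    return "\n".join(out)


before = "from line_profiler import profile as codeflash_line_profile\n\ndef f():\n    pass"
after = add_enable_after_import(before, "/tmp/baseline_lprof")
print(after.splitlines()[1])
# codeflash_line_profile.enable(output_prefix='/tmp/baseline_lprof')
```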


class ImportAdder(cst.CSTTransformer):
def __init__(self, import_statement):
self.import_statement = import_statement
self.has_import = False

def leave_Module(self, original_node, updated_node):
# If the import is already there, don't add it again
if self.has_import:
return updated_node

# Parse the import statement into a CST node
import_node = cst.parse_statement(self.import_statement)

# Add the import to the module's body
return updated_node.with_changes(
body=[import_node] + list(updated_node.body)
)

def visit_ImportFrom(self, node):
# Check if the profile is already imported from line_profiler
if node.module and node.module.value == "line_profiler":
for import_alias in node.names:
if import_alias.name.value == "profile":
self.has_import = True


def add_decorator_imports(function_to_optimize, code_context):
"""Adds a profile decorator to a function in a Python file and all its helper functions."""
#self.function_to_optimize, file_path_to_helper_classes, self.test_cfg.tests_root
#grouped iteration, file to fns to optimize, from line_profiler import profile as codeflash_line_profile
file_paths = defaultdict(list)
line_profile_output_file = get_run_tmp_file(Path("baseline_lprof"))
file_paths[function_to_optimize.file_path].append(function_to_optimize.qualified_name)
for elem in code_context.helper_functions:
file_paths[elem.file_path].append(elem.qualified_name)
for file_path, fns_present in file_paths.items():
#open file
file_contents = file_path.read_text("utf-8")
# parse to cst
module_node = cst.parse_module(file_contents)
for fn_name in fns_present:
# add decorator
module_node = add_decorator_to_qualified_function(module_node, fn_name, "codeflash_line_profile")
# add imports
# Create a transformer to add the import
transformer = ImportAdder("from line_profiler import profile as codeflash_line_profile")
# Apply the transformer to add the import
module_node = module_node.visit(transformer)
modified_code = isort.code(module_node.code, float_to_top=True)
# write to file
file_path.write_text(modified_code, encoding="utf-8")
#Adding profile.enable line for changing the savepath of the data, do this only for the main file and not the helper files
file_contents = function_to_optimize.file_path.read_text("utf-8")
modified_code = add_profile_enable(file_contents, str(line_profile_output_file))
function_to_optimize.file_path.write_text(modified_code, encoding="utf-8")
return line_profile_output_file
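`add_decorator_imports` first groups the target function and all of its helpers by file path, so that each file is parsed and rewritten exactly once even when several target functions live in it. A small sketch of that grouping step, with hypothetical stand-in tuples in place of the `FunctionToOptimize`/helper objects:

```python
from collections import defaultdict
from pathlib import Path

# hypothetical stand-ins for the function-to-optimize and its helpers
targets = [
    (Path("a.py"), "sort_classmethod"),
    (Path("b.py"), "BubbleSortClass.sorter"),
    (Path("a.py"), "helper"),
]

file_paths = defaultdict(list)
for path, qualified_name in targets:
    file_paths[path].append(qualified_name)

# a.py is rewritten once with both of its target functions decorated
print(dict(file_paths))
```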