feat(llmobs): [MLOB-4258] add support for OpenAI server-side MCP calls #15057
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 216 ± 3 ms.
The average import time from base is: 222 ± 3 ms.
The import time difference between this PR and base is: -5.3 ± 0.1 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate nicole-cybul/openai-mcp-support (ef54ef9) with baseline main (6f04282)

📈 Performance Regressions (2 suites)

📈 iastaspects - 118/118
✅ add_aspect: Time: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -0.4%; Memory: ✅ 39.892MB (SLO: <41.500MB -3.9%) vs baseline: +6.0%
✅ add_inplace_aspect: Time: ✅ 0.407µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.4%; Memory: ✅ 39.381MB (SLO: <41.500MB -5.1%) vs baseline: +4.8%
✅ add_inplace_noaspect: Time: ✅ 0.314µs (SLO: <10.000µs 📉 -96.9%) vs baseline: -3.2%; Memory: ✅ 39.695MB (SLO: <41.500MB -4.3%) vs baseline: +5.6%
✅ add_noaspect: Time: ✅ 0.279µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.1%; Memory: ✅ 39.420MB (SLO: <41.500MB -5.0%) vs baseline: +4.1%
✅ bytearray_aspect: Time: ✅ 1.398µs (SLO: <10.000µs 📉 -86.0%) vs baseline: +6.0%; Memory: ✅ 39.852MB (SLO: <41.500MB -4.0%) vs baseline: +5.4%
✅ bytearray_extend_aspect: Time: ✅ 1.538µs (SLO: <10.000µs 📉 -84.6%) vs baseline: +0.2%; Memory: ✅ 39.990MB (SLO: <41.500MB -3.6%) vs baseline: +6.0%
✅ bytearray_extend_noaspect: Time: ✅ 0.609µs (SLO: <10.000µs 📉 -93.9%) vs baseline: -1.3%; Memory: ✅ 39.479MB (SLO: <41.500MB -4.9%) vs baseline: +5.0%
✅ bytearray_noaspect: Time: ✅ 0.482µs (SLO: <10.000µs 📉 -95.2%) vs baseline: +0.2%; Memory: ✅ 39.675MB (SLO: <41.500MB -4.4%) vs baseline: +5.3%
✅ bytes_aspect: Time: ✅ 1.297µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +0.5%; Memory: ✅ 39.833MB (SLO: <41.500MB -4.0%) vs baseline: +6.0%
✅ bytes_noaspect: Time: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.4%; Memory: ✅ 39.813MB (SLO: <41.500MB -4.1%) vs baseline: +4.9%
✅ bytesio_aspect: Time: ✅ 1.305µs (SLO: <10.000µs 📉 -86.9%) vs baseline: -0.3%; Memory: ✅ 39.892MB (SLO: <41.500MB -3.9%) vs baseline: +5.3%
✅ bytesio_noaspect: Time: ✅ 0.501µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.3%; Memory: ✅ 39.852MB (SLO: <41.500MB -4.0%) vs baseline: +4.7%
✅ capitalize_aspect: Time: ✅ 0.736µs (SLO: <10.000µs 📉 -92.6%) vs baseline: -0.8%; Memory: ✅ 39.499MB (SLO: <41.500MB -4.8%) vs baseline: +4.0%
✅ capitalize_noaspect: Time: ✅ 0.434µs (SLO: <10.000µs 📉 -95.7%) vs baseline: ~same; Memory: ✅ 39.518MB (SLO: <41.500MB -4.8%) vs baseline: +4.8%
✅ casefold_aspect: Time: ✅ 0.746µs (SLO: <10.000µs 📉 -92.5%) vs baseline: +0.8%; Memory: ✅ 39.892MB (SLO: <41.500MB -3.9%) vs baseline: +5.3%
✅ casefold_noaspect: Time: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.1%; Memory: ✅ 39.852MB (SLO: <41.500MB -4.0%) vs baseline: +4.7%
✅ decode_aspect: Time: ✅ 0.728µs (SLO: <10.000µs 📉 -92.7%) vs baseline: +0.9%; Memory: ✅ 39.459MB (SLO: <41.500MB -4.9%) vs baseline: +4.9%
✅ decode_noaspect: Time: ✅ 0.418µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.4%; Memory: ✅ 39.793MB (SLO: <41.500MB -4.1%) vs baseline: +4.7%
✅ encode_aspect: Time: ✅ 0.707µs (SLO: <10.000µs 📉 -92.9%) vs baseline: -0.8%; Memory: ✅ 39.499MB (SLO: <41.500MB -4.8%) vs baseline: +5.2%
✅ encode_noaspect: Time: ✅ 0.400µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -1.2%; Memory: ✅ 39.852MB (SLO: <41.500MB -4.0%) vs baseline: +4.5%
✅ format_aspect: Time: ✅ 3.298µs (SLO: <10.000µs 📉 -67.0%) vs baseline: -0.7%; Memory: ✅ 39.499MB (SLO: <41.500MB -4.8%) vs baseline: +5.0%
✅ format_map_aspect: Time: ✅ 3.470µs (SLO: <10.000µs 📉 -65.3%) vs baseline: -1.5%; Memory: ✅ 39.479MB (SLO: <41.500MB -4.9%) vs baseline: +4.9%
✅ format_map_noaspect: Time: ✅ 0.775µs (SLO: <10.000µs 📉 -92.2%) vs baseline: +0.1%; Memory: ✅ 39.459MB (SLO: <41.500MB -4.9%) vs baseline: +5.0%
✅ format_noaspect: Time: ✅ 0.595µs (SLO: <10.000µs 📉 -94.1%) vs baseline: ~same; Memory: ✅ 39.813MB (SLO: <41.500MB -4.1%) vs baseline: +4.8%
✅ index_aspect: Time: ✅ 0.359µs (SLO: <10.000µs 📉 -96.4%) vs baseline: +0.6%; Memory: ✅ 39.833MB (SLO: <41.500MB -4.0%) vs baseline: +5.8%
✅ index_noaspect: Time: ✅ 0.279µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.5%; Memory: ✅ 39.852MB (SLO: <41.500MB -4.0%) vs baseline: +4.9%
✅ join_aspect: Time: ✅ 1.378µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.4%; Memory: ✅ 39.872MB (SLO: <41.500MB -3.9%) vs baseline: +5.9%
✅ join_noaspect: Time: ✅ 0.495µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.8%; Memory: ✅ 39.440MB (SLO: <41.500MB -5.0%) vs baseline: +4.1%
✅ ljust_aspect: Time: ✅ 2.589µs (SLO: <20.000µs 📉 -87.1%) vs baseline: +4.3%; Memory: ✅ 39.695MB (SLO: <41.500MB -4.3%) vs baseline: +5.2%
✅ ljust_noaspect: Time: ✅ 0.405µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.8%; Memory: ✅ 39.872MB (SLO: <41.500MB -3.9%) vs baseline: +4.9%
✅ lower_aspect: Time: ✅ 2.194µs (SLO: <10.000µs 📉 -78.1%) vs baseline: +0.5%; Memory: ✅ 39.715MB (SLO: <41.500MB -4.3%) vs baseline: +5.3%
✅ lower_noaspect: Time: ✅ 0.366µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -1.1%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +4.9%
✅ lstrip_aspect: Time: ✅ 2.581µs (SLO: <20.000µs 📉 -87.1%) vs baseline: 📈 +17.1%; Memory: ✅ 39.479MB (SLO: <41.500MB -4.9%) vs baseline: +4.9%
✅ lstrip_noaspect: Time: ✅ 0.384µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +1.3%; Memory: ✅ 39.872MB (SLO: <41.500MB -3.9%) vs baseline: +4.8%
✅ modulo_aspect: Time: ✅ 1.037µs (SLO: <10.000µs 📉 -89.6%) vs baseline: -0.5%; Memory: ✅ 39.440MB (SLO: <41.500MB -5.0%) vs baseline: +4.7%
✅ modulo_aspect_for_bytearray_bytearray: Time: ✅ 1.555µs (SLO: <10.000µs 📉 -84.5%) vs baseline: -0.2%; Memory: ✅ 39.597MB (SLO: <41.500MB -4.6%) vs baseline: +5.3%
✅ modulo_aspect_for_bytes: Time: ✅ 0.978µs (SLO: <10.000µs 📉 -90.2%) vs baseline: ~same; Memory: ✅ 39.833MB (SLO: <41.500MB -4.0%) vs baseline: +4.9%
✅ modulo_aspect_for_bytes_bytearray: Time: ✅ 1.255µs (SLO: <10.000µs 📉 -87.5%) vs baseline: +0.3%; Memory: ✅ 39.459MB (SLO: <41.500MB -4.9%) vs baseline: +5.0%
✅ modulo_noaspect: Time: ✅ 0.625µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.4%; Memory: ✅ 39.813MB (SLO: <41.500MB -4.1%) vs baseline: +4.5%
✅ replace_aspect: Time: ✅ 4.866µs (SLO: <10.000µs 📉 -51.3%) vs baseline: -0.2%; Memory: ✅ 39.440MB (SLO: <41.500MB -5.0%) vs baseline: +4.8%
✅ replace_noaspect: Time: ✅ 0.458µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -0.1%; Memory: ✅ 39.813MB (SLO: <41.500MB -4.1%) vs baseline: +4.7%
✅ repr_aspect: Time: ✅ 0.908µs (SLO: <10.000µs 📉 -90.9%) vs baseline: +0.6%; Memory: ✅ 39.892MB (SLO: <41.500MB -3.9%) vs baseline: +6.2%
✅ repr_noaspect: Time: ✅ 0.412µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -1.1%; Memory: ✅ 39.400MB (SLO: <41.500MB -5.1%) vs baseline: +3.5%
✅ rstrip_aspect: Time: ✅ 1.924µs (SLO: <20.000µs 📉 -90.4%) vs baseline: +2.1%; Memory: ✅ 39.459MB (SLO: <41.500MB -4.9%) vs baseline: +5.0%
✅ rstrip_noaspect: Time: ✅ 0.374µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -1.1%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +5.0%
✅ slice_aspect: Time: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.2%; Memory: ✅ 39.970MB (SLO: <41.500MB -3.7%) vs baseline: +6.4%
✅ slice_noaspect: Time: ✅ 0.447µs (SLO: <10.000µs 📉 -95.5%) vs baseline: +0.4%; Memory: ✅ 39.793MB (SLO: <41.500MB -4.1%) vs baseline: +4.7%
✅ stringio_aspect: Time: ✅ 1.529µs (SLO: <10.000µs 📉 -84.7%) vs baseline: -0.3%; Memory: ✅ 39.499MB (SLO: <41.500MB -4.8%) vs baseline: +4.5%
✅ stringio_noaspect: Time: ✅ 0.715µs (SLO: <10.000µs 📉 -92.9%) vs baseline: -1.4%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +4.9%
✅ strip_aspect: Time: ✅ 2.223µs (SLO: <20.000µs 📉 -88.9%) vs baseline: +0.4%; Memory: ✅ 39.813MB (SLO: <41.500MB -4.1%) vs baseline: +5.9%
✅ strip_noaspect: Time: ✅ 0.383µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.5%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +5.1%
✅ swapcase_aspect: Time: ✅ 2.399µs (SLO: <10.000µs 📉 -76.0%) vs baseline: +0.4%; Memory: ✅ 39.459MB (SLO: <41.500MB -4.9%) vs baseline: +5.1%
✅ swapcase_noaspect: Time: ✅ 0.536µs (SLO: <10.000µs 📉 -94.6%) vs baseline: -0.6%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +4.8%
✅ title_aspect: Time: ✅ 2.347µs (SLO: <10.000µs 📉 -76.5%) vs baseline: +0.9%; Memory: ✅ 39.774MB (SLO: <41.500MB -4.2%) vs baseline: +5.6%
✅ title_noaspect: Time: ✅ 0.507µs (SLO: <10.000µs 📉 -94.9%) vs baseline: +0.9%; Memory: ✅ 39.872MB (SLO: <41.500MB -3.9%) vs baseline: +4.9%
✅ translate_aspect: Time: ✅ 3.216µs (SLO: <10.000µs 📉 -67.8%) vs baseline: +0.5%; Memory: ✅ 39.420MB (SLO: <41.500MB -5.0%) vs baseline: +4.8%
✅ translate_noaspect: Time: ✅ 1.045µs (SLO: <10.000µs 📉 -89.5%) vs baseline: ~same; Memory: ✅ 39.577MB (SLO: <41.500MB -4.6%) vs baseline: +5.2%
✅ upper_aspect: Time: ✅ 2.195µs (SLO: <10.000µs 📉 -78.0%) vs baseline: +0.4%; Memory: ✅ 39.734MB (SLO: <41.500MB -4.3%) vs baseline: +5.6%
✅ upper_noaspect: Time: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.9%; Memory: ✅ 39.911MB (SLO: <41.500MB -3.8%) vs baseline: +4.8%

📈 telemetryaddmetric - 30/30
✅ 1-count-metric-1-times: Time: ✅ 2.897µs (SLO: <20.000µs 📉 -85.5%) vs baseline: -2.0%; Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +5.1%
✅ 1-count-metrics-100-times: Time: ✅ 202.020µs (SLO: <220.000µs -8.2%) vs baseline: -1.7%; Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +4.7%
✅ 1-distribution-metric-1-times: Time: ✅ 3.338µs (SLO: <20.000µs 📉 -83.3%) vs baseline: +0.7%; Memory: ✅ 34.701MB (SLO: <35.500MB -2.2%) vs baseline: +5.0%
✅ 1-distribution-metrics-100-times: Time: ✅ 214.938µs (SLO: <230.000µs -6.5%) vs baseline: -2.4%; Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.7%
✅ 1-gauge-metric-1-times: Time: ✅ 2.538µs (SLO: <20.000µs 📉 -87.3%) vs baseline: 📈 +16.5%; Memory: ✅ 34.623MB (SLO: <35.500MB -2.5%) vs baseline: +4.7%
✅ 1-gauge-metrics-100-times: Time: ✅ 137.952µs (SLO: <150.000µs -8.0%) vs baseline: +0.3%; Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +5.0%
✅ 1-rate-metric-1-times: Time: ✅ 3.480µs (SLO: <20.000µs 📉 -82.6%) vs baseline: 📈 +12.1%; Memory: ✅ 34.524MB (SLO: <35.500MB -2.7%) vs baseline: +4.7%
✅ 1-rate-metrics-100-times: Time: ✅ 214.241µs (SLO: <250.000µs 📉 -14.3%) vs baseline: -1.5%; Memory: ✅ 34.505MB (SLO: <35.500MB -2.8%) vs baseline: +4.7%
✅ 100-count-metrics-100-times: Time: ✅ 20.399ms (SLO: <22.000ms -7.3%) vs baseline: -0.3%; Memory: ✅ 34.623MB (SLO: <35.500MB -2.5%) vs baseline: +5.0%
✅ 100-distribution-metrics-100-times: Time: ✅ 2.248ms (SLO: <2.300ms -2.3%) vs baseline: -0.3%; Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.5%
✅ 100-gauge-metrics-100-times: Time: ✅ 1.431ms (SLO: <1.550ms -7.7%) vs baseline: +0.8%; Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.8%
✅ 100-rate-metrics-100-times: Time: ✅ 2.225ms (SLO: <2.550ms 📉 -12.8%) vs baseline: +0.6%; Memory: ✅ 34.544MB (SLO: <35.500MB -2.7%) vs baseline: +4.7%
✅ flush-1-metric: Time: ✅ 4.616µs (SLO: <20.000µs 📉 -76.9%) vs baseline: +4.8%; Memory: ✅ 34.623MB (SLO: <35.500MB -2.5%) vs baseline: +5.0%
✅ flush-100-metrics: Time: ✅ 172.744µs (SLO: <250.000µs 📉 -30.9%) vs baseline: -0.2%; Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +4.9%
✅ flush-1000-metrics: Time: ✅ 2.122ms (SLO: <2.500ms 📉 -15.1%) vs baseline: -0.4%; Memory: ✅ 35.468MB (SLO: <36.500MB -2.8%) vs baseline: +4.9%

🟡 Near SLO Breach (4 suites)

🟡 djangosimple - 30/30
✅ appsec: Time: ✅ 19.286ms (SLO: <22.300ms 📉 -13.5%) vs baseline: ~same; Memory: ✅ 67.810MB (SLO: <70.500MB -3.8%) vs baseline: +4.3%
✅ exception-replay-enabled: Time: ✅ 1.341ms (SLO: <1.450ms -7.5%) vs baseline: +0.4%; Memory: ✅ 66.041MB (SLO: <67.500MB -2.2%) vs baseline: +4.8%
✅ iast: Time: ✅ 19.311ms (SLO: <22.250ms 📉 -13.2%) vs baseline: ~same; Memory: ✅ 67.889MB (SLO: <70.000MB -3.0%) vs baseline: +4.5%
✅ profiler: Time: ✅ 15.578ms (SLO: <16.550ms -5.9%) vs baseline: +0.4%; Memory: ✅ 55.883MB (SLO: <57.500MB -2.8%) vs baseline: +4.7%
✅ resource-renaming: Time: ✅ 19.381ms (SLO: <21.750ms 📉 -10.9%) vs baseline: ~same; Memory: ✅ 67.782MB (SLO: <70.500MB -3.9%) vs baseline: +4.3%
✅ span-code-origin: Time: ✅ 22.847ms (SLO: <28.200ms 📉 -19.0%) vs baseline: +0.3%; Memory: ✅ 69.613MB (SLO: <71.000MB 🟡 -2.0%) vs baseline: +5.0%
✅ tracer: Time: ✅ 19.290ms (SLO: <21.750ms 📉 -11.3%) vs baseline: ~same; Memory: ✅ 67.849MB (SLO: <70.000MB -3.1%) vs baseline: +4.7%
✅ tracer-and-profiler: Time: ✅ 21.402ms (SLO: <23.500ms -8.9%) vs baseline: ~same; Memory: ✅ 68.997MB (SLO: <71.000MB -2.8%) vs baseline: +4.9%
✅ tracer-dont-create-db-spans: Time: ✅ 19.312ms (SLO: <21.500ms 📉 -10.2%) vs baseline: ~same; Memory: ✅ 67.751MB (SLO: <70.000MB -3.2%) vs baseline: +4.5%
✅ tracer-minimal: Time: ✅ 16.634ms (SLO: <17.500ms -4.9%) vs baseline: +0.3%; Memory: ✅ 67.889MB (SLO: <70.000MB -3.0%) vs baseline: +4.9%
✅ tracer-native: Time: ✅ 19.346ms (SLO: <21.750ms 📉 -11.1%) vs baseline: +0.5%; Memory: ✅ 68.085MB (SLO: <72.500MB -6.1%) vs baseline: +4.9%
✅ tracer-no-caches: Time: ✅ 17.334ms (SLO: <19.650ms 📉 -11.8%) vs baseline: ~same; Memory: ✅ 67.771MB (SLO: <70.000MB -3.2%) vs baseline: +4.8%
✅ tracer-no-databases: Time: ✅ 18.710ms (SLO: <20.100ms -6.9%) vs baseline: -0.3%; Memory: ✅ 67.849MB (SLO: <70.000MB -3.1%) vs baseline: +5.0%
✅ tracer-no-middleware: Time: ✅ 18.953ms (SLO: <21.500ms 📉 -11.8%) vs baseline: -0.4%; Memory: ✅ 67.751MB (SLO: <70.000MB -3.2%) vs baseline: +4.9%
✅ tracer-no-templates: Time: ✅ 19.153ms (SLO: <22.000ms 📉 -12.9%) vs baseline: +0.4%; Memory: ✅ 67.707MB (SLO: <70.500MB -4.0%) vs baseline: +4.7%

🟡 flasksimple - 18/18
✅ appsec-get: Time: ✅ 4.587ms (SLO: <4.750ms -3.4%) vs baseline: -0.3%; Memory: ✅ 63.970MB (SLO: <66.500MB -3.8%) vs baseline: +4.9%
✅ appsec-post: Time: ✅ 6.619ms (SLO: <6.750ms 🟡 -1.9%) vs baseline: ~same; Memory: ✅ 64.459MB (SLO: <66.500MB -3.1%) vs baseline: +4.9%
✅ appsec-telemetry: Time: ✅ 4.586ms (SLO: <4.750ms -3.5%) vs baseline: -0.4%; Memory: ✅ 64.052MB (SLO: <66.500MB -3.7%) vs baseline: +4.9%
✅ debugger: Time: ✅ 1.854ms (SLO: <2.000ms -7.3%) vs baseline: -0.6%; Memory: ✅ 47.885MB (SLO: <49.500MB -3.3%) vs baseline: +5.0%
✅ iast-get: Time: ✅ 1.857ms (SLO: <2.000ms -7.2%) vs baseline: +0.2%; Memory: ✅ 44.698MB (SLO: <49.000MB -8.8%) vs baseline: +4.8%
✅ profiler: Time: ✅ 1.920ms (SLO: <2.100ms -8.6%) vs baseline: -0.6%; Memory: ✅ 48.382MB (SLO: <50.000MB -3.2%) vs baseline: +4.9%
✅ resource-renaming: Time: ✅ 3.370ms (SLO: <3.650ms -7.7%) vs baseline: -0.2%; Memory: ✅ 54.693MB (SLO: <56.000MB -2.3%) vs baseline: +4.8%
✅ tracer: Time: ✅ 3.360ms (SLO: <3.650ms -7.9%) vs baseline: ~same; Memory: ✅ 54.746MB (SLO: <56.500MB -3.1%) vs baseline: +4.9%
✅ tracer-native: Time: ✅ 3.359ms (SLO: <3.650ms -8.0%) vs baseline: +0.2%; Memory: ✅ 54.537MB (SLO: <60.000MB -9.1%) vs baseline: +4.6%

🟡 iastpropagation - 8/8
✅ no-propagation: Time: ✅ 48.539µs (SLO: <60.000µs 📉 -19.1%) vs baseline: -0.7%; Memory: ✅ 39.479MB (SLO: <40.500MB -2.5%) vs baseline: +4.8%
✅ propagation_enabled: Time: ✅ 166.058µs (SLO: <190.000µs 📉 -12.6%) vs baseline: -1.1%; Memory: ✅ 39.459MB (SLO: <40.000MB 🟡 -1.4%) vs baseline: +4.8%
✅ propagation_enabled_100: Time: ✅ 1.853ms (SLO: <2.300ms 📉 -19.4%) vs baseline: +0.5%; Memory: ✅ 39.440MB (SLO: <40.000MB 🟡 -1.4%) vs baseline: +4.7%
✅ propagation_enabled_1000: Time: ✅ 32.008ms (SLO: <34.550ms -7.4%) vs baseline: -0.3%; Memory: ✅ 39.499MB (SLO: <40.000MB 🟡 -1.3%) vs baseline: +4.4%

🟡 otelspan - 22/22
✅ add-event: Time: ✅ 38.722ms (SLO: <47.150ms 📉 -17.9%) vs baseline: +0.7%; Memory: ✅ 39.024MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.7%
✅ add-metrics: Time: ✅ 258.980ms (SLO: <344.800ms 📉 -24.9%) vs baseline: +0.3%; Memory: ✅ 43.352MB (SLO: <47.500MB -8.7%) vs baseline: +4.8%
✅ add-tags: Time: ✅ 314.661ms (SLO: <321.000ms 🟡 -2.0%) vs baseline: +0.3%; Memory: ✅ 43.411MB (SLO: <47.500MB -8.6%) vs baseline: +5.3%
✅ get-context: Time: ✅ 81.170ms (SLO: <92.350ms 📉 -12.1%) vs baseline: +2.8%; Memory: ✅ 39.369MB (SLO: <46.500MB 📉 -15.3%) vs baseline: +4.7%
✅ is-recording: Time: ✅ 36.146ms (SLO: <44.500ms 📉 -18.8%) vs baseline: +0.4%; Memory: ✅ 38.954MB (SLO: <47.500MB 📉 -18.0%) vs baseline: +4.9%
✅ record-exception: Time: ✅ 56.900ms (SLO: <67.650ms 📉 -15.9%) vs baseline: ~same; Memory: ✅ 39.538MB (SLO: <47.000MB 📉 -15.9%) vs baseline: +4.9%
✅ set-status: Time: ✅ 42.216ms (SLO: <50.400ms 📉 -16.2%) vs baseline: -0.2%; Memory: ✅ 39.000MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.8%
✅ start: Time: ✅ 35.252ms (SLO: <43.450ms 📉 -18.9%) vs baseline: -0.4%; Memory: ✅ 38.992MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.9%
✅ start-finish: Time: ✅ 81.876ms (SLO: <88.000ms -7.0%) vs baseline: -0.3%; Memory: ✅ 36.589MB (SLO: <46.500MB 📉 -21.3%) vs baseline: +4.6%
✅ start-finish-telemetry: Time: ✅ 83.535ms (SLO: <89.000ms -6.1%) vs baseline: +0.1%; Memory: ✅ 36.707MB (SLO: <46.500MB 📉 -21.1%) vs baseline: +4.8%
✅ update-name: Time: ✅ 36.980ms (SLO: <45.150ms 📉 -18.1%) vs baseline: +0.7%; Memory: ✅ 38.999MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.9%
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
This reverts commit dfc0715.
Description
This PR adds support for server-side MCP calls made via the OpenAI Responses API.
In the Responses API, LLMs can invoke MCP tools on behalf of the client. They do this by asking the provided MCP server to list available tools and then calling the relevant tool.
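As a rough illustration of the flow described above, the sketch below builds a Responses API request that exposes one remote MCP server to the model. It is not the script used to test this PR. The model name, server label, and server URL are placeholders, and the helper names `build_mcp_request` and `send_with_mcp` are hypothetical.

```python
from typing import Any


def build_mcp_request(prompt: str) -> dict[str, Any]:
    """Build keyword arguments for a Responses API call with one remote MCP server."""
    return {
        "model": "gpt-4.1",  # assumed model name; any Responses-capable model works
        "input": prompt,
        "tools": [
            {
                "type": "mcp",
                "server_label": "example_server",  # hypothetical label
                "server_url": "https://example.com/mcp",  # hypothetical URL
                "require_approval": "never",  # let the model call tools without approval
            }
        ],
    }


def send_with_mcp(client: Any, prompt: str) -> Any:
    """Send the request; `client` is expected to be an `openai.OpenAI` instance."""
    return client.responses.create(**build_mcp_request(prompt))
```

With a real client (`from openai import OpenAI; send_with_mcp(OpenAI(), "...")`), the response's `output` list would be expected to contain an `mcp_list_tools` item followed by one `mcp_call` item per tool invocation, alongside the usual message items.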
Our current support for these kinds of interactions is not great: we do not capture any tool calls, tool results, or tool spans.
This PR provides better support by:
McpCall output item and parsing it into a Tool Call and Tool Result for the current active LLM span
Manual Testing
I manually tested my changes with the following script:
Before
Before this change, traces for these server-side MCP use cases were missing most of the relevant detail. Running the script produces this trace, which does not parse the server-side MCP calls:
After
With the changes in this PR, the trace is much cleaner and captures all the information related to the MCP usage.
Tool Calls and Tool Results are highlighted:
Available Tools from the MCP server are captured:
A separate tool span is emitted:
I also tried this out with more than one tool call in this trace.
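The parsing described in this PR (turning each McpCall output item into a Tool Call and a Tool Result, and capturing the tools the MCP server advertises) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `parse_output_items` and the shape of the returned records are assumptions, and the input is modeled as plain dicts with `mcp_list_tools` and `mcp_call` item types.

```python
import json
from typing import Any


def parse_output_items(items: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Split Responses API output items into tool calls, tool results, and available tools."""
    parsed: dict[str, list[dict[str, Any]]] = {
        "tool_calls": [],
        "tool_results": [],
        "available_tools": [],
    }
    for item in items:
        kind = item.get("type")
        if kind == "mcp_list_tools":
            # Tools the MCP server reported when the model asked for its tool list.
            for tool in item.get("tools", []):
                parsed["available_tools"].append(
                    {"name": tool.get("name", ""), "description": tool.get("description", "")}
                )
        elif kind == "mcp_call":
            raw_args = item.get("arguments") or "{}"
            try:
                arguments = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
            except json.JSONDecodeError:
                arguments = {"value": raw_args}  # keep unparseable arguments verbatim
            call_id = item.get("id", "")
            name = item.get("name", "")
            parsed["tool_calls"].append(
                {"tool_id": call_id, "name": name, "arguments": arguments, "type": kind}
            )
            # One mcp_call item carries both the call and its outcome.
            error = item.get("error")
            result = item.get("output") or "" if error is None else str(error)
            parsed["tool_results"].append(
                {"tool_id": call_id, "name": name, "result": result, "type": kind}
            )
    return parsed
```

Because every `mcp_call` item yields one call record and one result record sharing the same `tool_id`, a response with several tool calls produces matching pairs in order.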
Risks
Additional Notes