feat: intro none engine for performance.#395
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a 'None' storage engine for in-memory task processing and implements session limits for the controller's cache to prevent resource exhaustion. Key changes include the NoneEngine implementation, configuration updates for resource limits, and an eviction mechanism in the storage layer. Review feedback identifies several critical issues: the session eviction logic is only triggered during session closure and lacks a proper 'oldest-first' policy; error handling in close_session is currently suppressed, which may mask failures for persistent engines; and the NoneEngine implementation for application lookups and session deletion will cause failures in the storage API due to unhandled NotFound errors.
| Ok(()) | ||
| } | ||
|
|
||
| fn evict_sessions(&self) -> Result<(), FlameError> { |
There was a problem hiding this comment.
| ssn.clone() | ||
| }; | ||
|
|
||
| let _ = self.engine.close_session(id.clone()).await; |
There was a problem hiding this comment.
Ignoring the result of self.engine.close_session() is a regression in error handling. While this was likely done because the NoneEngine returns NotFound, it prevents persistent engines (SQLite, Filesystem) from reporting actual I/O or database failures. A better approach would be to specifically handle NotFound errors while propagating others.
| let _ = self.engine.close_session(id.clone()).await; | |
| if let Err(e) = self.engine.close_session(id.clone()).await { | |
| if !matches!(e, FlameError::NotFound(_)) { | |
| return Err(e); | |
| } | |
| } |
| async fn get_application(&self, id: ApplicationID) -> Result<Application, FlameError> { | ||
| Err(FlameError::NotFound(format!("application <{}>", id))) | ||
| } |
There was a problem hiding this comment.
Returning NotFound here will cause Storage::get_application to fail, as that layer delegates directly to the engine for application lookups. Since NoneEngine is intended to rely on the controller's cache, either NoneEngine must maintain its own in-memory map of applications, or the Storage layer must be updated to check its cache before calling the engine.
| async fn delete_session(&self, id: SessionID) -> Result<Session, FlameError> { | ||
| self.remove_task_counter(&id)?; | ||
| Err(FlameError::NotFound(format!("session <{}>", id))) | ||
| } |
There was a problem hiding this comment.
Returning Err(FlameError::NotFound) here will cause Storage::delete_session to fail at the engine call, making it impossible to delete sessions when using the NoneEngine. Since NoneEngine does not persist data, it should ideally return a successful result. However, the Engine trait requires returning the deleted Session object, which NoneEngine does not track. This indicates a mismatch between the Engine trait assumptions and the NoneEngine design.
| let closed_session_id: Option<SessionID> = ssn_map.iter().find_map(|(id, ssn_ptr)| { | ||
| let ssn = lock_ptr!(ssn_ptr).ok()?; | ||
| if ssn.status.state == SessionState::Closed { | ||
| Some(id.clone()) | ||
| } else { | ||
| None | ||
| } | ||
| }); | ||
|
|
||
| if let Some(ssn_id) = closed_session_id { | ||
| ssn_map.remove(&ssn_id); | ||
| tracing::debug!( | ||
| "Evicted closed session <{}> from cache (limit: {})", | ||
| ssn_id, | ||
| max | ||
| ); | ||
| } |
There was a problem hiding this comment.
The current eviction logic has two issues:
- It uses
find_mapon aHashMapiterator, which results in arbitrary eviction of any closed session. This does not adhere to the "oldest" eviction policy described in the design document (which would require sorting bycompletion_time). - It only evicts a single session per call. If the cache size significantly exceeds the limit (e.g., if many sessions were created without eviction being triggered), it will take many operations to bring the cache back within the limit. Consider using a loop to evict sessions until the limit is reached.
- Add NoneEngine storage backend that keeps all data in memory - Implement session limits with oldest-first eviction policy - Add in-memory application cache for NoneEngine - Handle NotFound errors gracefully for NoneEngine operations - Add benchmark CI workflow with none storage configuration - Fix task state updates to work with NoneEngine
fix #394