-
Notifications
You must be signed in to change notification settings - Fork 37
Update file system config for crc #216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
PR Review: Update file system config for CRCSummaryThis PR adds persistent volume support to the local development backend deployment to fix a file system issue where message persistence was failing with "no such file or directory" errors. ✅ Positive Aspects
🔍 Issues & RecommendationsCritical: Missing Directory InitializationThe root cause of your intermittent success is likely a race condition with directory creation. Looking at the backend code ( _ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)This creates the session-specific directory, but there's a potential issue:
Recommendation: Add an init container or startup command to ensure the base directory structure exists: # In backend-deployment.yaml, add:
initContainers:
- name: init-workspace
image: quay.io/ambient_code/vteam_backend:latest
command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 755 /workspace/sessions']
volumeMounts:
- name: backend-state
mountPath: /workspaceLow: Inconsistent Label ValuesThe PVC uses Recommendation: Change labels:
app: backend-api # Match deployment label
component: state-storageInfo: Missing storageClassNameCRC will use its default storage class, but for clarity and to match production expectations, consider explicitly setting: spec:
storageClassName: crc-csi-hostpath-provisioner # CRC's default
accessModes:
- ReadWriteOnceBackend Code: Silent Error HandlingIn _ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)Recommendation: Log errors for debugging (separate PR): if err := os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755); err != nil {
log.Printf("Failed to create session directory: %v", err)
return
}🧪 Testing Recommendations
📋 Pre-Merge Checklist
SummaryThe PR is on the right track and correctly identifies the need for persistent storage. The main issue is likely empty PVC initialization causing intermittent failures. Adding an init container should resolve the "no such file or directory" errors consistently. Recommended Action: Add the init container, test with a clean PVC, and this should be ready to merge. Review based on CLAUDE.md guidelines and production deployment patterns |
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
Code Review: Update file system config for crcI've reviewed the changes in this PR. Overall, this is a solid fix for the filesystem persistence issue in the local CRC development environment. Here's my detailed feedback: ✅ What's Working Well
🔍 Code Quality Issues1. Missing Storage Class (Medium Priority)The PVC doesn't specify a spec:
accessModes:
- ReadWriteOnce
storageClassName: "" # Use default storage class explicitly
resources:
requests:
storage: 5GiOr for CRC-specific storage: storageClassName: crc-csi-hostpath-provisioner # CRC's default2. Init Container Permissions (Low Priority)The init container command uses
Current: command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 755 /workspace/sessions']Consider: command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 750 /workspace/sessions']3. Missing SecurityContext (Medium Priority)According to the CLAUDE.md Backend Development Standards, Job pods should have SecurityContext. The backend deployment is missing this: containers:
- name: backend
image: ...
securityContext:
allowPrivilegeEscalation: false
runAsNonRoot: true
capabilities:
drop: ["ALL"]This should be added for production-readiness, though it may not be the cause of the current issue. 🐛 Potential Issues1. Init Container Image Pull (High Priority)The init container uses: image: image-registry.openshift-image-registry.svc:5000/vteam-dev/vteam-backend:latestThis assumes the backend image is already built and available. If someone runs
Recommendation: Add error handling or use a busybox-based init container: initContainers:
- name: init-workspace
image: registry.access.redhat.com/ubi9/ubi-minimal:latest
command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 750 /workspace/sessions']
volumeMounts:
- name: backend-state
mountPath: /workspaceThis is more reliable and doesn't depend on the backend image. 2. Directory Creation in Application Code (Low Priority)Looking at _ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)The code already tries to create the directory. The issue is that Why the init container works: It creates the parent Alternative approach: Fix the application code to create the full path: sessionDir := fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID)
if err := os.MkdirAll(sessionDir, 0o755); err != nil {
log.Printf("Failed to create session directory: %v", err)
return
}This should already work, so the issue might be permissions on 📋 Testing Recommendations
🔒 Security Considerations
🎯 Performance Considerations
✨ Suggestions for Improvement
📝 SummaryRecommendation: Approve with minor changes The PR solves the immediate problem and follows the right patterns. The main improvements needed are:
Priority fixes before merge:
Can be addressed in follow-up PRs:
Great work identifying and fixing this issue! 🎉 |
Code Review: Update file system config for crcThanks for working on this issue! I've reviewed the PR and have several observations and suggestions. ✅ Positive Aspects
🔍 Issues and Recommendations1. Init Container Security Context MissingIssue: The init container lacks a From CLAUDE.md:
Recommendation: Add security context to the init container: initContainers:
- name: init-workspace
image: image-registry.openshift-image-registry.svc:5000/vteam-dev/vteam-backend:latest
command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 755 /workspace/sessions']
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
volumeMounts:
- name: backend-state
mountPath: /workspace2. Init Container May Be UnnecessaryIssue: The // websocket/hub.go:165
_ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)Question: Is the init container actually needed, or does the issue stem from the PVC not being mounted at all in the original deployment? Recommendation: Test whether removing the init container still works. If the backend creates directories correctly, the init container adds unnecessary complexity. 3. Missing Resource LimitsIssue: Neither the PVC creation step nor the init container have resource constraints. Recommendation:
initContainers:
- name: init-workspace
# ...
resources:
requests:
memory: "32Mi"
cpu: "10m"
limits:
memory: "64Mi"
cpu: "50m"4. PVC Creation TimingIssue: The PVC is created immediately before the backend deployment ( Recommendation: log "Creating backend PVC..."
oc apply -f "${MANIFESTS_DIR}/backend-pvc.yaml" -n "$PROJECT_NAME"
# Wait for PVC to be bound
log "Waiting for PVC to be bound..."
oc wait --for=condition=bound pvc/backend-state-pvc -n "$PROJECT_NAME" --timeout=60s || {
log "WARNING: PVC did not bind within 60 seconds"
}5. Storage Class Not SpecifiedIssue: The PVC does not specify a Recommendation: Explicitly set the storage class for CRC: spec:
storageClassName: crc-csi-hostpath-provisioner # CRC's default
accessModes:
- ReadWriteOnce🧪 Testing Recommendations
🐛 Potential Root Cause AnalysisBased on the error message you shared: The issue is likely NOT that the directory does not exist (since
Your fix (adding the PVC) should resolve issue #1, which is the most likely culprit. 📋 Suggested Changes Summary
🎯 Next Steps
Overall, this is a solid fix that aligns with the production deployment pattern. With the minor adjustments above, it should reliably solve the message persistence issue in CRC. Great debugging work! 🚀 |
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
Code Review: Update file system config for crcThank you for addressing the local development filesystem issue! This PR correctly identifies and fixes the root cause of the message persistence problem. Positive Aspects
Critical Issues1. StorageClassName Hardcoded to CRCFile: components/scripts/local-dev/manifests/backend-pvc.yaml:11 Problem: Impact: Production PVC omits storageClassName to use cluster default. This divergence breaks portability. Recommendation: Remove the storageClassName field entirely to match production. 2. No PVC Binding VerificationFile: components/scripts/local-dev/crc-start.sh:307-308 Problem: No verification that PVC is bound before deploying backend. Recommendation: Add wait condition to ensure PVC is bound before proceeding. 3. InitContainer Image Not PinnedFile: components/scripts/local-dev/manifests/backend-deployment.yaml:20 Recommendation: Pin to specific version (e.g., ubi9/ubi-minimal:9.5) instead of :latest Testing Recommendations
SummaryOverall Assessment: ✅ Approve with Minor Changes This PR correctly fixes the root cause. Implementation is sound and follows security best practices. Primary blocker: Remove hardcoded storageClassName to match production Secondary improvements: Add PVC binding verification, update documentation Great work tracking this down! Once storageClassName is addressed, this should resolve the issues. Related Files
|
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
Code Review: Update file system config for CRCThanks for working on this important fix! I've reviewed the PR and have several observations about the implementation. The core approach of adding a PVC for persistent storage is sound, but there are some areas that need attention. 🎯 SummaryThis PR addresses the
✅ What's Working Well1. Root Cause Analysis
2. CONTRIBUTING.md
3. Init Container Pattern
|
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
PR Review: Update file system config for crcI've reviewed PR #216 and the changes look good overall. The fix properly addresses the filesystem permission issue in CRC by adding a PVC and init container. Here's my detailed feedback: ✅ Strengths
🔍 Issues & Recommendations1. Security Context Missing on Backend Container (Medium Priority)The production manifest ( CLAUDE.md requirement (line 1050-1057): // Always set SecurityContext for Job pods
SecurityContext: &corev1.SecurityContext{
AllowPrivilegeEscalation: boolPtr(false),
ReadOnlyRootFilesystem: boolPtr(false), // Only if temp files needed
Capabilities: &corev1.Capabilities{
Drop: []corev1.Capability{"ALL"},
},
},Recommendation: Add a containers:
- name: backend
image: image-registry.openshift-image-registry.svc:5000/vteam-dev/vteam-backend:latest
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
# ... rest of container specThis should be applied to both the local-dev and production manifests for consistency. 2. Init Container Permissions (Low Priority)The init container runs as root (by default) to create directories. While this works, consider if it's necessary: Current approach: initContainers:
- name: init-workspace
command: ['sh', '-c', 'mkdir -p /workspace/sessions && chmod 755 /workspace/sessions']Alternative consideration: If the PVC mounts with proper permissions, the init container might not be needed at all since _ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)Question: Does CRC require the parent 3. Storage Class Hardcoding (Medium Priority)The PVC manifest hardcodes the CRC storage class: storageClassName: crc-csi-hostpath-provisionerIssue: This will only work in CRC environments. If someone tries to use these manifests elsewhere, it will fail. Recommendation: Make this configurable or document it clearly. Options:
The production manifest ( 4. Error Handling in persistMessageToS3 (Low Priority)Line 165 in _ = os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755)While the subsequent if err := os.MkdirAll(fmt.Sprintf("%s/sessions/%s", StateBaseDir, message.SessionID), 0o755); err != nil {
log.Printf("persistMessage: MkdirAll failed: %v", err)
return
}This isn't a blocker for this PR since it's existing code, but worth noting for future improvements. 📝 Documentation QualityThe CONTRIBUTING.md file is excellent:
🧪 Testing ValidationQuestion for the author: Have you tested that:
⚡ Performance ConsiderationsNo performance concerns - the changes are minimal and appropriate for local development. 🎯 Recommendations SummaryBefore merging:
Optional improvements (can be separate PRs): ConclusionThis is a solid fix that addresses a real issue preventing local development with CRC. The changes are well-thought-out and the documentation additions are valuable. With the minor security enhancements mentioned above, this will be ready to merge. Great work identifying and fixing this issue! 🎉 |
Gkrumbach07
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great
Yahav and I were both having trouble running vTeam locally with crc. We were able to make projects and workspace and start an ideate session but then when we get to the message tab, we see no messages and get no response when we send one.
I looked into the OpenShift console and saw errors like this:
This PR is addresses the problem. It also adds a new CONTRIBUTING.md file with a section called "Local Development Setup" with a lot of content about getting and using CRC. The CONTRIBUTING.md file also has a lot of generic CONTRIBUTING.md stuff too, but I included it in this PR because the "Local Development Setup" is important for replicating my results showing that this works.
I've addressed most of the CodeRabbit concerns. The remaining ones seem not really applicable here because these are CRC-specific local development manifests. The storage class is required as CRC has no default storage class (verified in testing). Production manifests are separate in components/manifests/.