Research Update Enhanced src/AI/AI-llm-architecture/2.-data-... #1234
Merged
carlospolop merged 1 commit into master on Aug 4, 2025
🔗 Additional Context
Original Blog Post:
Content Categories: Based on the analysis, this content was categorized under "Research".
Repository Maintenance:
Review Notes:
Bot Version: HackTricks News Bot v1.0
merge
s3llh0lder pushed a commit to s3llh0lder/hacktricks that referenced this pull request on Aug 22, 2025
…_update_src_AI_AI-llm-architecture_2.-data-sampling_20250803_162345 Research Update Enhanced src/AI/AI-llm-architecture/2.-data-...
🤖 Automated Content Update
This PR was automatically generated by the HackTricks News Bot based on a technical blog post.
📝 Source Information
🎯 Content Summary
Research Update for src/AI/AI-llm-architecture/2.-data-sampling.md
This PR contains automated research updates including:
Generated by: HackTricks Research Agent
Timestamp: 2025-08-03 16:25:09 UTC
This update was generated through automated security research to keep HackTricks content current and comprehensive.
🔧 Technical Details
Research update for HackTricks content
🤖 Agent Actions
The file src/AI/AI-llm-architecture/2.-data-sampling.md has been updated with:
“Advanced Sampling Strategies (2023-2025)”
• Temperature-based dataset mixing (with formula + code)
• Sequence packing / dynamic batching efficiency notes
• Modern deduplication & quality-filter steps
“Security & Privacy Considerations During Sampling”
• Data-poisoning/backdoor risks and mitigations (PoisonGPT)
• Membership-inference / memorization guidance and stride advice
New consolidated References section (adds Llama 2 paper + PoisonGPT study, keeps original Manning book).
No other content was removed; the new material sits just before the References, keeping structure and banners intact.
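For reviewers who want a concrete picture of the "temperature-based dataset mixing" item listed above, here is a minimal sketch. The PR summary does not reproduce the formula or code that was added to the file, so this assumes the commonly used formulation in which source i with n_i examples is sampled with probability proportional to n_i^(1/T). The corpus names and sizes, and the helper names `temperature_mixing_weights` and `sample_sources`, are hypothetical and are not taken from the updated page.

```python
import random


def temperature_mixing_weights(sizes, temperature):
    """Return sampling probabilities p_i proportional to n_i ** (1 / temperature).

    Assumed formulation (not quoted from the PR): temperature = 1 gives
    size-proportional sampling; larger values flatten the distribution
    toward uniform, upweighting small sources.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {name: n ** (1.0 / temperature) for name, n in sizes.items()}
    total = sum(scaled.values())
    return {name: value / total for name, value in scaled.items()}


def sample_sources(weights, k, seed=0):
    """Draw k source names according to the mixing weights."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


# Hypothetical corpus sizes, for illustration only.
sizes = {"web": 1_000_000, "code": 10_000, "papers": 100}

proportional = temperature_mixing_weights(sizes, temperature=1.0)
flattened = temperature_mixing_weights(sizes, temperature=2.0)
batch_sources = sample_sources(flattened, k=8)
```

With T = 2 the small "papers" source receives a noticeably larger share than under size-proportional sampling, which is the usual motivation for the technique. The trade-off is that small sources are repeated more often, which is relevant to the memorization concerns listed in the same update.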
✅ Review Checklist
This PR was automatically created by the HackTricks Feed Bot. Please review the changes carefully before merging.