Life is chaotic, so this is going slower than expected. For ideas/thoughts, let me know.
Assess and interpret the reasoning capabilities of large language models.
Reasoning Assessment Dimensions
- Logical Reasoning: evaluation of deductive and inductive reasoning (see the probe sketch after this list)
- Causal Inference: mapping reasoning pathways and decision-making processes
- Semantic Understanding: analyzing the depth and nuance of contextual comprehension across varied text prompts
- Multimodal Testing / Input-Structure Variation: TBD
- **...**: TBD
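A minimal sketch of what a deductive-reasoning probe could look like. `query_model` is a hypothetical stand-in for whatever LLM API ends up under test, and the items are toy examples, not a real benchmark:

```python
# Minimal deductive-reasoning probe. `query_model` is a hypothetical
# stand-in for whatever LLM API is under test.
from typing import Callable

ITEMS = [
    # (premises + question, expected answer)
    ("All squares are rectangles. Shape S is a square. Is S a rectangle? Answer yes or no.", "yes"),
    ("No fish are mammals. A trout is a fish. Is a trout a mammal? Answer yes or no.", "no"),
]

def deductive_accuracy(query_model: Callable[[str], str]) -> float:
    """Fraction of syllogism-style items answered correctly."""
    correct = 0
    for prompt, gold in ITEMS:
        answer = query_model(prompt).strip().lower()
        correct += answer.startswith(gold)
    return correct / len(ITEMS)

if __name__ == "__main__":
    # Trivial stub so the script runs without an API key.
    print(deductive_accuracy(lambda prompt: "yes"))
```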
Mechanistic Interpretability
- Neuron-level analysis / attention-mechanism visualization / ... (see the example below)
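As a starting point for the attention side, a small sketch using Hugging Face `transformers` to pull per-layer, per-head attention maps out of GPT-2; the model choice is purely illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)
model.eval()

inputs = tok("If all birds fly and Tweety is a bird, Tweety flies.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: one tensor per layer, each [batch, heads, seq, seq]
last_layer = out.attentions[-1][0]         # [heads, seq, seq]
avg = last_layer.mean(dim=0)               # head-averaged attention map
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
for i, token in enumerate(tokens):
    top = avg[i].topk(3).indices.tolist()  # 3 most-attended positions
    print(f"{token:>10s} -> {[tokens[j] for j in top]}")
```

The head-averaged matrix can be passed straight to a heatmap plot; per-head maps are often more informative but noisier.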
Explainability Frameworks
- Gradient-based attribution, LIME, SHAP (TODO: add citations); a gradient-saliency sketch follows
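For the gradient-based branch, a minimal input-times-gradient saliency sketch over GPT-2. The model and the scoring choice (the logit of the model's own top next-token prediction) are illustrative assumptions, not settled choices for this repo:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tok("If all birds fly and Tweety is a bird, then Tweety", return_tensors="pt")
# Embed tokens manually so gradients can flow back to the inputs.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeds.requires_grad_(True)
logits = model(inputs_embeds=embeds).logits
# Score the model's own top prediction for the next token.
score = logits[0, -1].max()
score.backward()
# Input-x-gradient saliency: one scalar per input token.
saliency = (embeds.grad * embeds).sum(-1).abs()[0]
for token, s in zip(tok.convert_ids_to_tokens(inputs["input_ids"][0]), saliency):
    print(f"{token:>12s}  {s.item():.4f}")
```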
Reasoning Performance Indicators
| Metric | Description | Measurement Approach |
|---|---|---|
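Once the table is filled in, metrics can be computed along these lines; exact-match accuracy is shown purely as a placeholder example, not a committed indicator:

```python
def exact_match(predictions: list[str], references: list[str]) -> float:
    """Share of predictions matching the reference after light normalization."""
    assert len(predictions) == len(references)
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references)

print(exact_match(["Yes", "no"], ["yes", "Yes"]))  # 0.5
```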
Future Directions
- Advanced multimodal reasoning evaluations
- Cross-model comparative studies
- Development of novel interpretability techniques
Installation

```bash
git clone https://github.com/CorpaciLC/LLM-reasoning.git
cd LLM-reasoning
pip install -r requirements.txt
# editable/development install; `pip install -e .` is the modern equivalent
python setup.py develop
```