
Blue_team_MechanisticInterpretability

The goal of mechanistic interpretability is to take a trained model and reverse engineer, from its weights, the algorithms the model learned during training. We have no idea how these algorithms work, nor how to write them ourselves.
Our project investigates agent behavior across different reinforcement learning methods, optimizing the agent's strategy to gain as much reward as possible.
This notebook presents methods for pruning different reinforcement learning algorithms that exhibit different agent behavior across models, to facilitate research into understanding these strategies. It implements and visualizes Dynamic Programming (DP), Monte Carlo (MC), and Temporal Difference (TD) algorithms and compares their results. The pruning algorithm takes a given dataset with two cases and shows the shortest path, depending on how the agent learns from its behavior, where the transition probabilities are given.

Algorithms (model-based and model-free); a minimal sketch of each appears after this list
- 1: Policy & Value Iteration
- 2: Monte Carlo
- 3: Temporal Difference
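
To make the model-based vs. model-free distinction concrete, here is a minimal, self-contained Python sketch of all three families on a toy one-dimensional shortest-path world with a slip probability. This is not the notebook's code: the environment, constants, and function names (`transitions`, `sample_step`, `SLIP`, and so on) are all illustrative assumptions, and the notebook's grid and dataset will differ.

```python
import random
from collections import defaultdict

# Tiny 1-D shortest-path world: states 0..4, the goal sits at state 4.
# Actions: 0 = left, 1 = right. With probability SLIP the move is reversed,
# mirroring the "given probabilities" setup described above.
N_STATES, GOAL, SLIP, GAMMA = 5, 4, 0.1, 0.9
ACTIONS = (0, 1)

def transitions(s, a):
    """Full model [(prob, next_state, reward), ...] -- used by DP only."""
    if s == GOAL:
        return [(1.0, s, 0.0)]
    step = 1 if a == 1 else -1
    intended = min(max(s + step, 0), N_STATES - 1)
    slipped = min(max(s - step, 0), N_STATES - 1)
    reward = lambda ns: 1.0 if ns == GOAL else -0.1
    return [(1.0 - SLIP, intended, reward(intended)),
            (SLIP, slipped, reward(slipped))]

def sample_step(s, a):
    """Sample a single transition -- all the model-free methods ever see."""
    x, acc = random.random(), 0.0
    for p, ns, r in transitions(s, a):
        acc += p
        if x < acc:
            return ns, r
    return ns, r  # fall through on floating-point rounding

def eps_greedy(Q, s, eps):
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

# 1) Policy/value iteration: dynamic programming over the known model.
def value_iteration(theta=1e-8):
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            best = max(sum(p * (r + GAMMA * V[ns])
                           for p, ns, r in transitions(s, a)) for a in ACTIONS)
            delta, V[s] = max(delta, abs(best - V[s])), best
        if delta < theta:
            return V

# 2) Every-visit Monte Carlo control: learn only from complete episodes.
def mc_control(episodes=3000, eps=0.1):
    Q, n = defaultdict(float), defaultdict(int)
    for _ in range(episodes):
        episode, s = [], 0
        while s != GOAL:                      # roll out one full episode
            a = eps_greedy(Q, s, eps)
            ns, r = sample_step(s, a)
            episode.append((s, a, r))
            s = ns
        G = 0.0
        for s, a, r in reversed(episode):     # back up the return G
            G = r + GAMMA * G
            n[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / n[(s, a)]
    return Q

# 3) Q-learning: one-step temporal-difference updates, no model required.
def q_learning(episodes=3000, alpha=0.1, eps=0.1):
    Q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = eps_greedy(Q, s, eps)
            ns, r = sample_step(s, a)
            td_target = r + GAMMA * max(Q[(ns, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (td_target - Q[(s, a)])
            s = ns
    return Q

V = value_iteration()
Q = q_learning()
print("DP state values:", [round(v, 3) for v in V])
print("TD greedy policy:", [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)])
```

Note how `value_iteration` sweeps the full transition model, while `mc_control` and `q_learning` only ever see sampled transitions; that is exactly the model-based/model-free split the list above draws.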

How to use it?

Download the notebook and run it with Jupyter or Google Colab.

Dataset

Set the correct path to upload the dataset. The data was generated randomly; you can choose any row by its ID (Neptune), which has two cases. Feel free to play with it and see how the agent behaves.
data
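
If you want a starting point for loading the data in Python, the sketch below is one way to do it; the file name `data.csv` and the ID column name are assumptions, so adjust both to match the generated file.

```python
import pandas as pd

# Both the path and the column name are assumptions -- point them at your copy.
DATA_PATH = "data.csv"
df = pd.read_csv(DATA_PATH)

# Pick any row by its ID and inspect the two cases recorded for it.
row = df[df["ID"] == 7]
print(row)
```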

About

Mechanistic Interpretability of Generative Language Models (fine-tuning GPT-2)
