Richard Ren (notrichardren)

👋 Hi, I’m Richard Ren. I'm interested in tech & policy related to artificial intelligence and climate change.
💓 On the technical research side, I'm currently interested in encoding values and complex moral nuance in AI systems via RLHF, as well as robustness and automated model evaluation. I've recently been working on a list of research proposals to improve safety in LLMs.
🌱 On the policy-relevant data analysis & software tools side, I'm interested in climate adaptation and using satellite data to proxy environmental or economic variables of interest.
📫 hi.richard.ren@gmail.com

Current and recent projects:

🛠 Replicating Toolformer with a Wolfram Alpha API, OpenAI's API, and a two-pass prompting procedure [Link]
🤖 Detecting reward hacking in toy gridworld environments with LLMs, via OpenAI GPT-4 API calls [Link]
🗺 Land use and land cover estimates in large European cities via CNN segmentation and classification
📚 Going through the ARENA (Alignment Research Engineering Accelerator) Curriculum: interpretability of transformer circuits, ablation and path-patching, probing, indirect object identification, RL in OpenAI's Gym environment, and training LLMs at scale [Link]
🏹 Generating AI preferences for fine-tuning language models via reinforcement learning from AI feedback (RLAIF)
💡 Using inference-time intervention and probing to investigate truthfulness in models [Link]
