notrichardren/README.md
  • 👋 Hi, I’m Richard Ren. I'm interested in tech & policy related to artificial intelligence and climate change.
  • 💓 On the technical research side, I'm currently interested in encoding values and embedding complex moral nuance in AI systems via RLHF, as well as robustness and automated model evaluation. I've recently been working on a list of research proposals to improve safety in LLMs.
  • 🌱 On the policy-relevant data analysis & software tools side, I'm interested in climate adaptation and using satellite data to proxy environmental or economic variables of interest.
  • 📫 hi.richard.ren@gmail.com

Current and recent projects:

  • 🛠 Replicating Toolformer with a Wolfram Alpha API, OpenAI's API, and a two-pass prompting procedure [Link]
  • 🤖 Detecting reward hacking in toy gridworld environments with LLMs, via OpenAI GPT-4 API calls [Link]
  • 🗺 Land use and land cover estimates in large European cities via CNN segmentation and classification
  • 📚 Going through the ARENA (Alignment Research Engineering Accelerator) Curriculum: interpretability of transformer circuits, ablation and path-patching, probing, indirect object identification, RL in OpenAI's Gym environment, and training LLMs at scale [Link]
  • 🏹 Generating AI preferences for fine-tuning language models via reinforcement learning from AI feedback (RLAIF)
  • 💡 Using inference-time intervention and probing to investigate truthfulness in models [Link]

Pinned

  1. magikarp01/iti_capstone Public

    Analyzing truth representations in LLMs across different kinds of truth and intervening on their hidden states to make LLMs more truthful

    Jupyter Notebook 5 1

  2. arena-curriculum Public

    Forked from callummcdougall/ARENA_2.0

    Exercises on mechanistic interpretability, RL, and training models at scale

    Jupyter Notebook

  3. jamescampbell57/llama-lying Public

    Jupyter Notebook 10 1