
Organizations

@metavoiceio

vatsalaggarwal/README.md

Hi there 👋

I've been trying to enable a world with personalized content generation since the summer of 2015. I graduated with a Mathematics degree and then built the first end-to-end Generative AI-based model for Alexa's voice (a dramatic improvement over the earlier technology behind Hawking's voice). I then spent a couple of years researching data efficiency for Generative AI models, unsupervised learning, and disentanglement (learning interpretable factors in speech generation, such as speaker identity or accent, and controlling them in human-interpretable ways). Some of this work has been published and patented. Toward the end of my time at Amazon, I worked with Sparse Gaussian Processes in the Inventory Optimisation team. I left quickly after realizing my heart was still in what I had set out to do in the summer of 2015, i.e. personalized content generation, and started a company with a friend to enable that world instead.

Over the past 4.5 years of my professional career, I've worked extensively with VAEs, Normalising Flows, various Transformer models, and GANs. Even though I trained my first waveform-level "language model" (convolutional, not transformer :)) in 2018 (it took more than 2 months for a single model to converge back in 2017), I continue to be shocked daily by the power and capabilities of these models. Discovering what they can do continues to feel like magic.

If this world interests you, please reach out for a coffee. I'd love to meet you.

Popular repositories

  1. whisper-cli Public

    A command-line interface for transcribing and translating audio using OpenAI's Whisper API.

    Python · 23 stars · 3 forks

  2. blockchain-text-to-sql Public

    JavaScript · 6 stars · 2 forks

  3. munki-pkg Public

    Forked from munki/munki-pkg

    Repo for the munkipkg tool and example projects

    Python

  4. cloud-instance-management Public

    CLI to manage cloud instance workflows via one-liners.

    Python

  5. posthog-js Public

    Forked from PostHog/posthog-js

    posthog-js allows you to send usage data from JS/TS product to PostHog, with autocapture.

    TypeScript

  6. vercel-for-python Public

    Combines PyneCone and Modal to bring alive the "Vercel for Python" dream.