kaihuchen/articles

List of GenAI Articles, Code Repositories, and Apps

Ensemble GenAI

GenAI for the Enterprises

  • Navigating the Challenges of Large-scale Chatbot Deployment: When considering the integration of GenAI chatbots within an enterprise's operations, it's crucial to recognize the potential challenges. Key insights include:

    • Abruptly replacing a human workforce with GenAI chatbots at large scale (e.g., in helpdesk operations) can be very risky and often leads to significant service disruptions.
    • A flexible, low-risk transition to GenAI can be achieved by designing GenAI chatbots not as direct replacements for the existing workforce, but as productivity enhancers for it.

    Published: April 4, 2024

  • Brainstorming with Chatbots on building a Universal Information Worker Chatbot: How to use an ensemble of debating chatbots to design and implement a Universal Information Worker chatbot for enterprises.
    Published: March 14, 2024

  • The Manchurian Chatbot Problem: Are Chatbot Viruses Coming Our Way?: Large Language Model (LLM) chatbots are becoming more advanced, but this progress could also lead to the "Manchurian Candidate" problem: the risk of adversaries implanting malicious data into LLMs' long-term memories, which could be triggered to cause harm.
    Published: March 12, 2024

  • Building Enterprise GenAI Chatbots: It is important to recognize that enterprise GenAI chatbots are essentially the latest iteration of traditional enterprise software applications. As such, many of the same challenges faced during their development cycles should not be overlooked or underestimated.
    Published: Feb 27, 2024

GenAI Vision Applications

  • Navigating the Pitfalls of Vision Language Models: This is a story about how Claude-3 lost an image classification debate it was winning to GPT-4V, due to its weak personality. It is also about a cougar on a house deck, and how AI could mistake it for just a dog.
    Published: March 18, 2024

  • Is Anthropic's Claude-3 Ready for AGI?: Does Anthropic's multimodal Claude-3 have enough visual common sense to support AGI (Artificial General Intelligence)? It is actually kind of close.

    Published: March 4, 2024

  • Is OpenAI's GPT-4V Ready for AGI?: Does OpenAI's vision model GPT-4V have enough visual common sense to support AGI (Artificial General Intelligence)? It is actually kind of close.

    We ran 17 types of visual common sense tests against GPT-4V to find out how well it can deal with the real world, and here are the results.

    Published: Feb 29, 2024

  • LMM for Level 5 Autonomous Driving: Tests the potential of LMMs (Large Multimodal Models) in a Level 5 autonomous driving scenario, demonstrating their capabilities in visual recognition, scene analysis, commonsense responses, and explanation.
    Published: Feb 2024

  • LMM for a Home GuardianBot: (draft) Tests the potential of LMMs (Large Multimodal Models) in a home robot scenario: detecting hazards or threats, then taking action or escalating alerts as appropriate.
    Unpublished

  • (Upcoming) Is OpenAI's GPT-4V vision model sufficient to support AGI?

Hidden Problems in GenAI

  • Hallucination due to flawed Abductive Reasoning: My experiments show that most chatbots built from LLMs (Large Language Models) struggle with abductive reasoning, which often leads to incorrect or "hallucinated" responses that aren't easy to spot.
    Published: Feb 24, 2024

Apps

  • Online apps
    • OpenAI GPTs Store (an OpenAI account is needed to access)
      • Creative Imaginator: upload an image, select one of many predefined styles, and the app automatically generates a new image that differs from the original in creative ways but still follows its main theme.

        This is basically an image-to-text-to-image tool with support for style injection, which is easier to manage than a text-to-image system (which requires substantial prompt engineering skill) or the image-to-image approach (which is best for making small changes).

        This tool is ideal for anyone who just needs some illustrations. A gallery is available here for viewing.

      • Automatic Backseat Driver: upload a road scene, and have the scene analyzed in detail along with the recommended high-level action of whether to proceed, stop, or turn around.

        This is a tool to test how well OpenAI's vision model GPT-4V can be used as a high-level component for driving a Level 5 Autonomous Vehicle.

        See this gallery for a long list of road scenes tested with this tool, which demonstrate that GPT-4V performs surprisingly well even in highly challenging situations.

      • Radiologist V2:

Additional Resources

About

Holding place for various articles
