Merged
@@ -2,7 +2,7 @@

This repository contains detailed information related to Oracle Cloud Infrastructure GPU compute instances.

-Reviewed: 26.02.2024
+Reviewed: 16.10.2025

# Table of Contents

@@ -4,7 +4,7 @@ This repository provides a step-by-step deployment of DeepSpeed training for Lar

This setup includes a tuned DeepSpeed configuration (`tuned_ds_config.json`) that provides up to **13% performance improvement** over standard configurations.

-Reviewed: 06.06.2025
+Reviewed: 16.10.2025
# When to use this asset?

Use this asset when you need to:
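The hunk above mentions a tuned DeepSpeed configuration (`tuned_ds_config.json`). A minimal sketch of what such a config file might contain — the option names are standard DeepSpeed fields, but the specific values here are illustrative assumptions, not the repository's actual tuning:

```python
import json

# Hypothetical DeepSpeed ZeRO config; values are illustrative assumptions,
# the repository's tuned_ds_config.json may differ.
tuned_ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,                 # shard optimizer state and gradients
        "overlap_comm": True,       # overlap communication with backward pass
        "contiguous_gradients": True,
    },
    "gradient_clipping": 1.0,
}

def write_config(path: str, config: dict) -> None:
    """Serialize the config for use with `deepspeed --deepspeed_config <path>`."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

write_config("tuned_ds_config.json", tuned_ds_config)
```

Tuning typically trades off micro-batch size against gradient accumulation while keeping the effective batch size constant (here 4 × 8 = 32 per GPU).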
@@ -7,7 +7,7 @@ and
using
[Docker Compose](https://docs.docker.com/compose/).

-Reviewed: 20.05.2025
+Reviewed: 16.10.2025

# When should this asset be used?

@@ -8,7 +8,7 @@ on the Oracle Container Engine for Kubernetes (OKE) using
Reference results from NVIDIA to train Llama 3 can be found on the
[NGC Catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/dgxc-benchmarking/resources/llama3-dgxc-benchmarking).

-Reviewed: 01.07.2025
+Reviewed: 16.10.2025

# When to use this asset?

@@ -4,7 +4,7 @@ This repository intends to demonstrate how to deploy [NVIDIA NIM](https://develo

The model used is `Llama2-7B-chat`, running on an NVIDIA A10 Tensor Core GPU hosted on OCI. For scalability, the model repository is hosted in a bucket in Oracle Cloud Object Storage.

-Reviewed 23.05.2024
+Reviewed: 16.10.2025

# When to use this asset?

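NIM deployments like the one described above expose an OpenAI-compatible HTTP API. A sketch of building the JSON body for a chat completion request — the model name, parameters, and endpoint URL are assumptions for illustration, to be replaced with values from your own deployment:

```python
import json

# Build an OpenAI-compatible chat completion request body for a NIM
# endpoint. Model name and defaults are illustrative assumptions.
def build_chat_request(prompt: str, model: str = "llama2-7b-chat",
                       max_tokens: int = 256) -> str:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    return json.dumps(payload)

body = build_chat_request("What is OCI Object Storage?")
# body would be POSTed to e.g. http://<host>:8000/v1/chat/completions
```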
@@ -2,7 +2,7 @@

This repository is a variant of the Retrieval Augmented Generation (RAG) tutorial available [here](https://github.com/oracle-devrel/technology-engineering/tree/main/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai). Instead of the OCI GenAI Service, it uses a local deployment of Mistral 7B Instruct v0.3 using a vLLM inference server powered by an NVIDIA A10 GPU.

-Reviewed: 23.05.2024
+Reviewed: 16.10.2025

# When to use this asset?

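The retrieval step at the heart of the RAG variant described above can be sketched with toy data: embed the documents and the query, rank by cosine similarity, and prepend the best match to the prompt sent to the local Mistral 7B endpoint. The vectors below are made up for illustration; a real deployment would use an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, docs):
    """docs: list of (text, embedding) pairs; returns the closest text."""
    return max(docs, key=lambda d: cosine(query_vec, d[1]))[0]

# Toy corpus with made-up embeddings.
docs = [
    ("OCI offers A10 GPU shapes.", [0.9, 0.1, 0.0]),
    ("vLLM serves LLMs efficiently.", [0.1, 0.9, 0.2]),
]
context = retrieve([0.85, 0.2, 0.1], docs)
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```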
@@ -2,7 +2,7 @@

This repository intends to demonstrate how to deploy NVIDIA Triton Inference Server on Oracle Kubernetes Engine (OKE) with the TensorRT-LLM backend in order to serve Large Language Models (LLMs) in a Kubernetes architecture.

-Reviewed 23.05.2024
+Reviewed: 16.10.2025

# When to use this asset?

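Triton's TensorRT-LLM backend is typically served through an "ensemble" model whose `/v2/models/<name>/generate` endpoint accepts a simple JSON body. A sketch of building such a request — the model name and parameter defaults are assumptions for illustration:

```python
import json

# Build a request body for Triton's generate endpoint as used with the
# TensorRT-LLM backend; field names follow that backend's ensemble model,
# defaults here are illustrative assumptions.
def build_generate_request(prompt: str, max_tokens: int = 128) -> str:
    return json.dumps({
        "text_input": prompt,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    })

body = build_generate_request("Explain Kubernetes in one sentence.")
# body would be POSTed to e.g. http://<host>:8000/v2/models/ensemble/generate
```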
@@ -2,7 +2,7 @@

This repository provides a step-by-step tutorial for deploying and using the Mixtral 8x7B Large Language Model with the NVIDIA Triton Inference Server and the TensorRT-LLM backend.

-Reviewed: 23.05.2024
+Reviewed: 16.10.2025

# When to use this asset?

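Mixtral 8x7B is a sparse mixture-of-experts model with roughly 47B total parameters (approximate figure), which is why serving it needs the weights sharded across several GPUs. A back-of-envelope sizing sketch, assuming fp16 weights and 80 GB GPUs:

```python
# Rough memory estimate for Mixtral 8x7B weights at fp16; parameter count
# and GPU size are approximate assumptions, and KV cache / activations
# would add further overhead on top of this.
total_params = 47e9            # approximate total parameter count
bytes_per_param_fp16 = 2
weights_gib = total_params * bytes_per_param_fp16 / 2**30

gpu_mem_gib = 80               # e.g. an 80 GB A100/H100 (assumption)
min_gpus = -(-weights_gib // gpu_mem_gib)  # ceiling division

print(f"fp16 weights: {weights_gib:.0f} GiB -> at least {min_gpus:.0f} GPUs")
```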
@@ -2,7 +2,7 @@

This repository provides a step-by-step tutorial for deploying and using the [Mistral 7B Instruct](https://mistral.ai/technology/#models) Large Language Model with the [vLLM](https://github.com/vllm-project/vllm?tab=readme-ov-file) library.

-Reviewed: 23.05.2024
+Reviewed: 16.10.2025

# When to use this asset?

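Mistral 7B Instruct expects prompts wrapped in `[INST] ... [/INST]` tags. vLLM's OpenAI-compatible chat endpoint applies this template for you; a sketch of applying it manually, for example when hitting a raw completions endpoint (the helper name is ours, not part of vLLM):

```python
# Manually apply the Mistral Instruct chat template; the helper name is
# illustrative, and chat endpoints normally do this formatting for you.
def format_mistral_instruct(user_message: str, history=None) -> str:
    """history: optional list of (user, assistant) turns."""
    parts = ["<s>"]
    for user, assistant in history or []:
        parts.append(f"[INST] {user} [/INST] {assistant}</s>")
    parts.append(f"[INST] {user_message} [/INST]")
    return "".join(parts)

prompt = format_mistral_instruct("What is vLLM?")
```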