# Build General Corpus

The first step is to build our general corpus, which is our static asset. We've taken some material from the meetup web page, and we'll now use our "unofficial" SDK to build the corpus from it.

In [None]:
%pip install -q vectara-skunk-client

## Initialize our Client
We've tried to make this SDK as streamlined as possible to reduce
boilerplate in your codebase. Behind the scenes, this code relies on
implicit configuration to authenticate with OAuth2, which provides
access to all admin APIs.

In [None]:
from vectara_client.core import Factory
from vectara_client.admin import CorpusBuilder
import logging

logging.basicConfig(format='%(asctime)s:%(name)-35s %(levelname)s:%(message)s', level=logging.INFO, datefmt='%H:%M:%S %z')
logging.getLogger("OAuthUtil").setLevel(logging.WARNING)
logger = logging.getLogger(__name__)


client = Factory().build()
manager = client.corpus_manager

## Create our Corpus
We'll now use the convenience class "CorpusBuilder" to define our first corpus, "meetup-general", and the corpus manager to create it. This corpus has no special configuration.

In [None]:
corpus = CorpusBuilder("meetup-general").build()
corpus_id = manager.create_corpus(corpus, delete_existing=True)

## Load our Corpus
We'll now load our general corpus with content from the folder "../resources/general".

We can directly ingest data in Word (docx) format as well as many others.

In [None]:
from pathlib import Path

for path in Path("../resources/general").glob("*.docx"):
    client.indexer_service.upload(corpus_id, path)
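
If the folder later holds more than one file type, or an upload fails partway through, the loop above can be split into two small helpers. The following is a minimal sketch, not part of the SDK: `find_documents` and `upload_all` are hypothetical names introduced here, and the `upload` argument is any callable you supply (for example a lambda that wraps `client.indexer_service.upload`). Which file suffixes the platform accepts beyond `.docx` is not stated here, so check before widening the `extensions` argument.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def find_documents(folder: Path, extensions: Iterable[str] = (".docx",)) -> List[Path]:
    """Return the files in `folder` whose suffix is in `extensions`, sorted by path.

    Hypothetical helper: the suffix match is case-insensitive, and
    directories are skipped even if their name ends in a wanted suffix.
    """
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


def upload_all(paths: Iterable[Path], upload: Callable[[Path], object]) -> Dict[str, str]:
    """Call `upload` once per path and collect failures instead of stopping.

    Hypothetical helper: returns a mapping of file name to error message
    for every upload that raised an exception.
    """
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            upload(path)
        except Exception as exc:  # keep going so one bad file doesn't block the rest
            failures[path.name] = str(exc)
    return failures
```

In this notebook it could be called as `upload_all(find_documents(Path("../resources/general")), lambda p: client.indexer_service.upload(corpus_id, p))`, then inspecting the returned dictionary for any files that need a retry.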
    

## Test the Corpus
We'll now run a few test questions to confirm we get good responses.

In [None]:
response = client.query_service.query(
    "what is the motto for DataEngBytes?", corpus_id, summary=True, 
    summarizer="vectara-summary-ext-v1.3.0", summary_result_count=5)
logger.info(f"Response was: {response.summary[0].text}")

In [None]:
response = client.query_service.query(
    "Who is the event organizer?", corpus_id, summary=True, 
    summarizer="vectara-summary-ext-v1.3.0", summary_result_count=5)
logger.info(f"Response was: {response.summary[0].text}")
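
Since both test cells repeat the same query arguments, they could be folded into one small helper. This is a sketch only: `ask` is a hypothetical function, it takes the query service as a parameter rather than importing the SDK, and it assumes (as the cells above do) that the response exposes a `summary` sequence whose items have a `text` attribute. Returning `None` for an empty summary is a choice made here, not documented SDK behaviour.

```python
from typing import Any, Optional

# Same summarizer name used in the test cells above.
SUMMARIZER = "vectara-summary-ext-v1.3.0"


def ask(query_service: Any, question: str, corpus_id: Any,
        result_count: int = 5) -> Optional[str]:
    """Run one summarized query and return the first summary text, if any.

    Hypothetical helper: `query_service` is expected to offer a `query`
    method that accepts the same arguments as the cells above.
    """
    response = query_service.query(
        question, corpus_id, summary=True,
        summarizer=SUMMARIZER, summary_result_count=result_count)
    summaries = getattr(response, "summary", None)
    if not summaries:
        # Assumption: a missing or empty summary means there is no answer to report.
        return None
    return summaries[0].text
```

With the client from earlier, a call would look like `ask(client.query_service, "Who is the event organizer?", corpus_id)`.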