Lookup Company CIK by Name #5

Open
brockelmore opened this issue Feb 12, 2019 · 0 comments
Hi there,

My fork is highly specialized to my needs at this point, but I thought I would post the code I use to look up a company by name and get the corresponding CIK.

Changes to parser.go:

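// cikPostPageParser scans the lookup result page for links whose href ends in
// "CIK=<digits>" and returns the last such CIK, zero-padded to 10 characters.
// It uses errors, io, regexp, and golang.org/x/net/html.
func cikPostPageParser(page io.Reader) (string, error) {
	doc, err := html.Parse(page)
	if err != nil {
		return "", err
	}
	// The capture group holds only the digits, without an optional leading "+".
	r := regexp.MustCompile(`CIK=[+]?(\d{2,})$`)
	var cik string
	var f func(*html.Node)
	f = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "a" {
			for _, a := range n.Attr {
				if a.Key == "href" {
					if m := r.FindStringSubmatch(a.Val); m != nil {
						cik = m[1]
					}
					break
				}
			}
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			f(c)
		}
	}
	f(doc)
	if cik == "" {
		return "", errors.New("could not find CIK")
	}
	// Left-pad with zeros to the 10-character CIK form.
	for len(cik) < 10 {
		cik = "0" + cik
	}
	return cik, nil
}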
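
The two steps the parser performs on each link, pulling the digits out of the href and left-padding them to 10 characters, can be sketched in isolation as below. This is only an illustration: the helper names `cikFromHref` and `padCIK` and the sample href are made up for the example and are not part of the fork.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// cikInHref matches a trailing CIK query parameter and captures its digits.
var cikInHref = regexp.MustCompile(`CIK=\+?(\d{2,})$`)

// cikFromHref (hypothetical helper) returns the CIK digits at the end of href.
func cikFromHref(href string) (string, bool) {
	m := cikInHref.FindStringSubmatch(href)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// padCIK (hypothetical helper) left-pads cik with zeros to 10 characters.
func padCIK(cik string) string {
	if len(cik) >= 10 {
		return cik
	}
	return strings.Repeat("0", 10-len(cik)) + cik
}

func main() {
	// The href is a made-up sample in the shape the parser looks for.
	if cik, ok := cikFromHref("browse-edgar?action=getcompany&CIK=320193"); ok {
		fmt.Println(padCIK(cik)) // prints 0000320193
	}
}
```

Using a capture group avoids the separate split on "=", and keeps a leading "+" out of the result.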

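// postPage submits the company name to the lookup form and returns the
// response body, or nil if the request fails. The caller must close the body.
func postPage(url1 string, cn string) io.ReadCloser {
	resp, err := http.PostForm(url1, url.Values{"company": {cn}})
	if err != nil {
		// Log and return nil so the caller can handle the failure itself.
		log.Printf("Query to SEC page %s failed: %v", url1, err)
		return nil
	}
	return resp.Body
}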

Changes to page.go:

var (
	baseURL   string = "https://www.sec.gov/"
	cikURL    string = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&output=xml&CIK=%s"
	backupCIK string = "https://www.sec.gov/cgi-bin/cik_lookup"
	queryURL  string = "cgi-bin/browse-edgar?action=getcompany&CIK=%s&type=%s&dateb=&owner=exclude&count=10"
	searchURL string = baseURL + queryURL
)

// getCompanyCIK resolves a ticker symbol or a company name to its CIK.
// It returns "" when no CIK can be found.
func getCompanyCIK(ticker string) string {
	fmt.Println("getting company CIK")
	// If the "ticker" has a space in it, we assume it is a company name.
	byName := strings.Contains(ticker, " ")
	if !byName {
		// Otherwise we assume it is a ticker and try the ticker lookup first.
		r := getPage(fmt.Sprintf(cikURL, ticker))
		// This is inefficient: upstream requires an unclosed resp.Body, so the
		// only way to test whether the ticker worked is to read this response
		// here and make a second call later.
		rb, err := ioutil.ReadAll(r)
		// A read error is treated like a failed ticker lookup.
		byName = err != nil || strings.Contains(string(rb), "No matching Ticker Symbol.")
	}
	if !byName {
		r2 := getPage(fmt.Sprintf(cikURL, ticker)) // the inefficient second call
		if cik, err := cikPageParser(r2); err == nil {
			return cik
		}
		return ""
	}
	// Fall back to the name lookup form.
	r := postPage(backupCIK, ticker)
	if r == nil {
		return ""
	}
	defer r.Close()
	cik, err := cikPostPageParser(r)
	if err != nil {
		return ""
	}
	fmt.Println(cik)
	return cik
}

It works. It's not pretty, but it removes the limitation of searching only by CIK or ticker symbol (many smaller companies do not resolve automatically that way).

Also, I am working on a mass collection of words to correlate them with tags, so the number of tag-to-concept mappings should increase when I am finished. I can submit some additional tags for you if you want.
