
Comments for List Google Pages Indexed for SEO: Two Step How To #234

Open
phinjensen opened this issue Nov 11, 2017 · 8 comments


phinjensen commented Nov 11, 2017

Comments for https://www.endpointdev.com/blog/2009/12/google-pages-indexed-seo/
By Steph Skardal

To enter a comment:

  1. Log in to GitHub
  2. Leave a comment on this issue.

original author: Shane M Hansen
date: 2009-12-14T11:55:20-05:00

I'd suggest using curl and bash's {} grouping operators to stream content to sed like this. You could also check out my post on poor man's concurrency with bash to run several of these processes at once.

{ for i in http://www.google.com http://www.backcountry.com; do curl "$i"; done; } | sed -e '/stuff/d'
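
As a minimal sketch of that concurrency idea (the URLs, output filenames, and sed pattern here are placeholders): fetch each page in the background, wait for all of the fetches to finish, then filter the combined output.

for i in http://www.google.com http://www.backcountry.com; do
    curl -s "$i" -o "$(basename "$i").html" &   # fetch in the background
done
wait                                            # block until every fetch finishes
cat ./*.html | sed -e '/stuff/d'                # then filter the combined output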


original author: Steph Powell
date: 2009-12-14T12:22:26-05:00

Hi Shane,

Thanks for the suggestion. However, running:

{ for i in "http://www.google.com/search?num=100&as_sitesearch=www.endpoint.com"; do wget $i; done; } | ...

or

{ for i in "http://www.google.com/search?num=100&as_sitesearch=www.endpoint.com"; do curl $i; done; } | ...

triggers a 403 (Forbidden) response, so it would take some hacking to get around Google's block on scripted requests. Perhaps I'll play around with the User-Agent setting and find a way to make curl requests succeed in the future.

~Steph


original author: Shane M Hansen
date: 2009-12-14T12:47:49-05:00

Good point. Google seems to require a user agent string. Using something as simple as
curl -A 'mozilla' 'http://www.google.com/search?num=100&as_sitesearch=www.endpoint.com'

seems to work. The Linkscape API is a great tool for this sort of thing as well.
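
Putting the two pieces together, a sketch of the full two-step pipeline might look like this (results.html and the sed pattern are placeholders for the real extraction step described in the post):

# Step 1: fetch a page of results with a browser-like user agent
curl -sA 'mozilla' -o results.html 'http://www.google.com/search?num=100&as_sitesearch=www.endpoint.com'
# Step 2: filter the HTML down to the indexed page URLs
sed -e '/stuff/d' results.html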


original author: Steph Powell
date: 2009-12-14T13:01:45-05:00

Yes, I've been thinking about working with the Linkscape API. According to the API docs, the free (limited) API lets you grab:
* the mozRank of the page requested
* the number of external, juice-passing links
* the subdomain mozRank
* the total number of links (coming soon!)
* Domain Authority (coming soon!)
* Page Authority (coming soon!)
* the top 500 links sorted by Page Authority (coming soon!)
* the top 3 linking domains sorted by Domain Authority (coming soon!)
* the top 3 anchor texts to the site or page (coming soon!)

I would love to integrate the Linkscape API into my SEO workflow.
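
As a hypothetical sketch only (the endpoint, parameter names, and signature-based auth here are assumptions about the Linkscape API of that era, not confirmed from its docs), a request might look something like:

# $ACCESS_ID, $EXPIRES, and $SIGNATURE are assumed credential placeholders
curl "http://lsapi.seomoz.com/linkscape/url-metrics/www.endpoint.com?AccessID=$ACCESS_ID&Expires=$EXPIRES&Signature=$SIGNATURE"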


original author: SEO Melbourne
date: 2010-03-31T03:00:26-04:00

This may be a stupid question, but how do I run the command?

Is it done through the browser?


original author: Robert
date: 2010-07-26T20:44:41-04:00

Worked great just by typing the command where you would otherwise enter the URL in your browser window.


original author: Reiki Vancouver
date: 2011-05-28T16:01:42-04:00

Hi Steph,

I ran the sed command in the terminal but don't know how to view the output of the URLs you showed.

Would you have any advice for a novice???

Thanks,
Daniel


original author: Steph Skardal
date: 2011-05-31T09:24:24-04:00

Reiki,

The results of the sed command are output directly to the terminal. If you want, you can redirect them into a file by appending "> filename" to the end of the command, then read the file with a text editor (vi, emacs, notepad, gedit, etc.).
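
For example, using the curl command from earlier in this thread (indexed-pages.txt is an arbitrary filename, and the sed pattern is still a placeholder):

curl -A 'mozilla' 'http://www.google.com/search?num=100&as_sitesearch=www.endpoint.com' | sed -e '/stuff/d' > indexed-pages.txt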

~Steph
